Web 2.0 Blog – Discovering Innovation Opportunities using Social Media

Opportunity: Spending of government money should have a purpose, and that purpose should be the benefit of someone, whether directly or indirectly. The benefit might be for an employee to work better, and that employee might be working to benefit a group of citizens. The administration wishes to create a more transparent, effective and innovative government as well as to reduce the federal deficit. In order to do this, the administration must identify opportunities for innovation which can increase efficiency and decrease spending, and it must make the case to the American people that it is making more effective use of taxpayer funds. I want to make the case here that linking spending data to the benefits of that spending, in ways which are detailed, clear and relevant to large numbers of citizens, is the best way to find innovations that create a more effective government and to give transparency meaning and value for the average citizen.


  • Linking Spending to Benefits: Federal spending is reported in ways which do not clearly connect it to the benefits that specific expenditures provide. While certain dollar amounts may be reported as going toward ‘Defense’, that is not specific enough to understand whether a given expenditure is justifiable, and it doesn’t allow an expenditure or group of expenditures to be compared to alternative solutions for the same specific benefit-oriented goal. Therefore we must find ways to better connect specific spending to specific benefits.
  • Benefits of expenditures must be clear and relevant: Benefits must be stated in ways which are relevant and understandable for a large number of citizens. For example, a system which tracks resources in a government program is not relevant until it is connected to the benefits that program provides and to whom it provides those benefits. Oftentimes expenditures are reported as supporting a program, system or piece of equipment, but are not clearly connected to the intermediary benefit they attempt to provide to a person, or to the outcome of the program or equipment and its end beneficiaries. What is relevant to the average citizen is not how systems support systems or programs support programs, but how overall efforts affect people: in what way they affect those people, who those people are, and what it costs to provide that benefit. Take the case of a self-help kiosk at a federal office. The relevant benefit is not how it supports the agency’s program but how many citizens it serves, how frequently it serves them, how well it serves them and at what cost per citizen.
  • Providing Spending-to-Benefit visibility to a large audience will spur innovation. Making the links between spending and outcome visible to a large audience is a critical step in identifying opportunities for innovation that increase government’s effectiveness. Innovation comes from diverse people considering things in different ways (remember KIDFAD from The Wisdom of Crowds), so making connections between spending and benefits broadly relevant and visible will provide the greatest opportunity for innovation in creating more effective means to achieve similar benefits. Innovation also comes from novel approaches to overall goals, so providing information on the overall cost of an end benefit delivered to people provides the greatest opportunity to invent other ways to provide that benefit. If, for instance, you simply focused on the cost of gas for a truck to travel 1000 miles, rather than on the benefit of transporting chairs on that truck, you might miss the opportunity to send them by train. And if you focused on the goal of having chairs at a location, you might notice that it could be cheaper to purchase them at the destination rather than pack them up and transport them back and forth.
  • Meaningful Transparency.  Making the connection between benefits and spending in ways which the average citizen can understand and find relevant is required in order to achieve government transparency in a way in which transparency will have meaning for the average citizen.

Approach: Identify, Find and Link Disparate Data Sources which can clarify the benefits of Government Expenditures

Datasets must be found which can connect government spending to both outcomes and benefits to people. For instance, Compete.com provides data on how many visitors a website receives. Connecting the cost of a government website to the number of visitors it receives per year can give a cost per citizen served. Therefore getting the free data provided by Compete.com and linking it to the cost of a government website will provide more transparency and a clear cost of the benefit provided. This can then be compared to other ways of providing that same benefit of information delivery.

Another example is connecting the expense of providing office furniture to a known number of employees in an agency. This makes clear the cost of providing office support per employee, which could be compared to private sector data.
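The arithmetic behind both examples is the same: an annual cost divided by the number of people served in that year. Here is a minimal sketch in Python. The `Expenditure` record, the function name and all of the figures are made up for illustration; real numbers would come from spending reports and from a source such as visitor counts or agency headcounts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expenditure:
    """One spending record linked to the number of people it serves."""
    name: str
    annual_cost_usd: float
    beneficiaries_per_year: int  # e.g. unique visitors or employees


def cost_per_beneficiary(item: Expenditure) -> float:
    """Return annual cost divided by beneficiaries served in that year."""
    if item.beneficiaries_per_year <= 0:
        raise ValueError(f"no beneficiary count available for {item.name!r}")
    return item.annual_cost_usd / item.beneficiaries_per_year


# Made-up figures for illustration only.
website = Expenditure("agency website", 250_000.0, 500_000)
furniture = Expenditure("office furniture", 120_000.0, 400)

for item in (website, furniture):
    print(f"{item.name}: ${cost_per_beneficiary(item):,.2f} per beneficiary")
```

Once every expenditure carries a figure like this, it can be set side by side with alternative ways of delivering the same benefit.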

Connecting government expenditures to their benefits and making clear the cost per beneficiary in relevant ways can become a starting point for encouraging innovation to make a more effective government, as well as for giving the idea of government transparency meaning and value to the largest number of people.

Case for Using the Resource Description Framework Or Linked Data Model:

While linking data can be done in many different ways,  I do want to give a plug for the linked data model in this instance, because in the long term, I believe it is the best way to connect government spending with the benefits of that spending. 

Of course connecting spending to benefits is not always as simple as the examples I gave, nor is the data easy to find and easy to connect. In fact you may need to link multiple datasets in a chain to get the benefit information in a way which is relevant and broadly understandable. The Resource Description Framework (RDF), or Linked Data model, gives us a way to start collecting this kind of data in a distributed fashion without strict central control, and it does not even require the data to be on the same server or system in order to be linkable. This makes RDF or Linked Data an ideal candidate to complete the long-term vision of linking complex federal spending data with its outcomes and benefits in a way which can have meaning for the average American citizen.
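To show what linking datasets in a chain might look like, here is a small sketch in plain Python. Each dataset is a list of (subject, predicate, object) statements, which is the basic shape of RDF data. The URIs, the predicate names and the numbers are all hypothetical, and a real system would use an RDF store rather than Python lists.

```python
# Three separately maintained datasets, each a list of
# (subject, predicate, object) statements. All values are hypothetical.
SPENDING = [
    ("http://example.gov/contract/42", "funds", "http://example.gov/program/kiosks"),
    ("http://example.gov/contract/42", "costUSD", 90_000),
]
PROGRAMS = [
    ("http://example.gov/program/kiosks", "operates", "http://example.gov/kiosk/7"),
]
USAGE = [
    ("http://example.gov/kiosk/7", "citizensServedPerYear", 30_000),
]


def objects(datasets, subject, predicate):
    """Yield every object stated for (subject, predicate) in any dataset."""
    for dataset in datasets:
        for s, p, o in dataset:
            if s == subject and p == predicate:
                yield o


def follow(datasets, start, path):
    """Follow a chain of predicates from a starting URI across datasets."""
    frontier = [start]
    for predicate in path:
        frontier = [o for node in frontier for o in objects(datasets, node, predicate)]
    return frontier


datasets = [SPENDING, PROGRAMS, USAGE]
contract = "http://example.gov/contract/42"
cost = follow(datasets, contract, ["costUSD"])[0]
served = follow(datasets, contract, ["funds", "operates", "citizensServedPerYear"])[0]
print(f"${cost / served:.2f} per citizen served")
```

The point of the sketch is that no single dataset knows the cost per citizen. It only appears when the shared identifiers let you walk from the contract to the program to the kiosk.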

Linked Data is now officially on its way to the commercial world.  Google is creating the market incentive to drive it, one element at a time.

Google’s recent unveiling of Google snippets and search options means it’s time to change how you display your products online. Your products and any customer reviews of them will soon be found if you have displayed them in one of several formats. The technical names for these are RDFa and several microformats. Using Google, people can also look for reviews more easily. Soon they will also be able to compare different products and extract specific information from your website using Google Squared.

What does this mean? It is the beginning of a change in how the underlying code of websites is written. It will mean that facts, social remarks or other categories of information which the consumer deems important will be more than keywords. Keywords are not dead, but they are now going to be one element of the search equation rather than the only one. Google will likely determine whether people find well formatted information more useful, and if they do, it is likely those sites will start to appear at the top of searches.

Of course this will not happen overnight, but this announcement, and the fact that Yahoo is already doing something similar with SearchMonkey, means the race to a Linked Data web of meaning has now started and will proceed step by step, probably based on what meaning consumers are searching for.

But yes, it’s time to change how your products are displayed online and time to start reconsidering social content on your website. That is certain.

Postscript: Another example of government spoofing was a prank cell phone call from India to the Pakistani Defense Ministry the day after the Mumbai terrorist attack. The caller claimed to be an Indian Defense Ministry official and said that India was going to retaliate. Planes went up in the air on both sides and the US had to intervene to prevent further escalation. The call was taken seriously because normal authentication procedures were not followed or did not exist.

Hot off the press: Another spoofing incident, one which alleges civil damages, involves Twitter and the St. Louis Cardinals’ manager Tony La Russa.

While in general I don’t think Western democracies have a lot to learn from the North Korean government, I think in the case of Gov 2.0 spoofing there might be an exception. The North Korean Central News Agency was recently impersonated on Twitter in a way which might have fooled a lot of people. The Twitter feed was made to look realistic because it used actual articles released by the Central News Agency. The prank was pulled off by a parody website called Stupidedia, and they didn’t seem to intend any harm with it.

But this points out how easy it is to pretend you are an official government agency on Twitter. Recently I advocated for a simple reciprocal link authentication policy which would place a link on any official government Web 2.0 account (Twitter, Facebook fan page, etc.) to a .gov or .mil page, which would in turn give a link or list of links to the official social media accounts for that agency. Then anyone could verify with 2 clicks that a social media account is authentically coming from an official government source. As government presence becomes more common on social media, we will likely see more attempts to grab attention through this type of impersonation. While it doesn’t seem like much could come of this, all it takes is one person believing one source is the voice of a government and acting on it to cause at the least embarrassment and at the worst some harm.

Tim Berners-Lee’s concept of linked data is clearly a way to make data more usable, whether this is public data or data within a large enterprise. Linked data promises a future in which related data is more interoperable and discoverable, and it opens the door for innovation.

But how do we take large existing data stores and apply linked data principles to achieve these benefits? We currently have massive existing data stores with complex security regimes which many legacy applications depend upon. To make them available as Linked Data is a huge challenge, especially if we were to recreate these data stores in XML syntax using RDF/RDFa or even simpler XML schemas. This is coupled with the fact that many of the benefits of the reconstituted data have not yet been invented, so an ROI argument cannot clearly be made. Of course, they haven’t been invented yet because, while many can agree the data would be more usable, those uses must be discovered by fiddling with the data in linked form and seeing what emerges. Since the linked form doesn’t yet exist, we have the classic chicken-and-egg problem.

Perhaps there is a step we can take toward linked data without making large changes to the existing data stores in government and industry. Let’s review the principles of Linked Data first (paraphrased from Wikipedia for clarity):

  • Use URIs (Uniform Resource Identifiers) to identify things that you expose to the Web as resources.
  • Use HTTP URIs so that people can locate and look up (dereference) these things.
  • Provide useful information about the resource when its URI is dereferenced.
  • Include links to other, related URIs in the exposed data as a means of improving information discovery on the Web.

The striking thing about these principles is that they don’t mention XML, RDFa and so on, but focus instead on linking data to definitions. So it would seem a hybrid solution between the linked data concept and existing databases is possible. We could add URIs as fields in existing databases for important elements and define a central location where we will track information about each element. For instance, in the US government there are lots of federal buildings used by multiple agencies, so I would assume many agencies have databases which refer to federal buildings. Why not establish a central location to define those buildings and assign each a URI? (A URI, by the way, is essentially a universal identifier for a real-world object. In effect it is a web page for each building, but the page would more likely contain data links than nice pictures. You will also see the term URN, or Uniform Resource Name, which is a kind of URI that names a thing without saying where to find it.)

So each federal building would have a URI/URN, and we could of course put more information about each building in a centrally defined schema, but that will start to be real work and have instant security issues. So why not initially just have the URIs contain reciprocal links to databases which also contain that identifier? The links would have brief descriptions, which do not compromise security, of what type of data is stored in the linked database. This would remove the need to re-secure a lot of information to make it available across departments and agencies. And here is the other key to success for this type of solution: don’t require the back links to the databases to expose data unless they already do so. If we start requiring data to be exposed in this step, it opens up the security Pandora’s box. We need to avoid imposing a new security regime for centralized data, because it is a stumbling block which would create delays and costs. And if people do not clearly see the benefits of this step, then it would simply die in committee in most cases.
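A stripped-down registry of this kind could be as small as the following sketch. It is only an illustration of the idea: the class names, the example URI, the database names and the stewards are all hypothetical. Note that a back link records where an identifier is stored and who to ask about it, and never the data itself.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BackLink:
    """Pointer to a database that stores the identifier; exposes no data."""
    database: str      # hypothetical database name
    steward: str       # whom to ask for access
    description: str   # brief, non-sensitive note on what is stored


@dataclass
class UriRegistry:
    """Central map from a URI to the databases that refer to it."""
    links: dict[str, list[BackLink]] = field(default_factory=dict)

    def register(self, uri: str, link: BackLink) -> None:
        self.links.setdefault(uri, []).append(link)

    def lookup(self, uri: str) -> list[BackLink]:
        return list(self.links.get(uri, []))


registry = UriRegistry()
building = "http://example.gov/id/building/0001"  # hypothetical URI
registry.register(building, BackLink("facilities_db", "Agency A", "maintenance schedules"))
registry.register(building, BackLink("leases_db", "Agency B", "tenant agencies and lease dates"))

for link in registry.lookup(building):
    print(f"{link.database} ({link.steward}): {link.description}")
```

Because the registry holds only descriptions, it can be published more widely than any of the databases it points to.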

So that is fine you say.  We have URIs for important data elements and for databases which contain those elements but it is not exposing data,  so where is the benefit?  I think this stripped down version of linked data would have 4 definite benefits:

  • Reference.  The URIs could serve as reference documents to find where similar information is stored. Users could then apply for security permissions on an as needed basis when they need to link to other databases.
  • Innovation. Users, who would now have a more complete map of available data, could begin to suggest more uses for linking the data.
  • Discoverability. Search engines (internal or external, depending on the security decided upon for the URNs) could make existing databases more discoverable because the engines could discover important data elements in the databases. Search engines use links to determine relevance to searches and are often key to researching problems.
  • Interoperability.  The process of assigning URIs will begin to expose problems in data interoperability due to different definitions in different databases. The URI map would serve as a survey of issues in creating truly interoperable data.
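As one illustration of the interoperability benefit, the sketch below surveys a URI map for elements that different databases define differently. The rows, the database names and the definitions are hypothetical; a real survey would draw them from the registry of back links.

```python
from collections import defaultdict

# (uri, database, how that database defines the element). All hypothetical.
DEFINITIONS = [
    ("http://example.gov/id/building/0001", "facilities_db", "street address"),
    ("http://example.gov/id/building/0001", "leases_db", "parcel number"),
    ("http://example.gov/id/building/0002", "facilities_db", "street address"),
    ("http://example.gov/id/building/0002", "security_db", "street address"),
]


def conflicting_definitions(rows):
    """Return {uri: {database: definition}} for URIs defined in more than one way."""
    by_uri = defaultdict(dict)
    for uri, database, definition in rows:
        by_uri[uri][database] = definition
    return {
        uri: defs
        for uri, defs in by_uri.items()
        if len(set(defs.values())) > 1
    }


conflicts = conflicting_definitions(DEFINITIONS)
for uri, defs in conflicts.items():
    print(uri)
    for database, definition in defs.items():
        print(f"  {database}: {definition}")
```

Each URI that shows up in the output is a place where the databases would need to agree on a definition before their data could be combined.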

So now the readers of this blog are in at least 2 camps.

  • Those who feel this is a half measure and would be a distraction from advocating for more completely linked data.
  • Those who are still not clear on the benefits of bothering to start the process of linking data at all.

I am hoping there is a third camp which sees this as a doable step in large enterprises such as the US government.  And that it would be the first step toward data which is more linked and therefore more usable for both public and internal uses, and eventually interoperable.

Let me know which camp you are in!

The future of the internet will involve more authentication than it does today, but here is a potential interim solution to provide some level of authentication for Gov 2.0 presence on online social networks such as Facebook and Twitter. A standard policy of having a reciprocal link back to a Facebook fan page or Twitter account on the .Gov/.Mil website which the social network page points to could be a simple interim solution. I call it Reciprocal Link Authentication.

Government 2.0 includes a government presence on non-government websites such as online social networks (OSNs) (think Facebook fan pages and Twitter accounts) so that citizens can encounter government guidance and assistance where they ‘live’ in cyberspace. But how can citizens be certain that the government account or representative is authentic? If you run into someone in the street and they say they are working for the government, how do you know for certain? They provide you with a badge or ID right at the beginning of the conversation.

If we encounter government workers as official government representatives in non-government cyberspace, should we also be able to see some sort of identification? Since a cyberidentity is in many cases easier to assume than an alias in real life (especially on social networks), shouldn’t there be a way to verify the authenticity of someone claiming to represent a government? Oftentimes government presences on OSNs, such as agency fan pages on Facebook or informational Twitter accounts, will have an official seal or emblem. The problem with this is that it is trivial and relatively low-risk to copy or create an image of a seal or official-looking emblem and put it on an anonymous OSN account, compared to duplicating a paper credential which someone might show you in person.

The commercial solution for authentication won’t work on social network pages. Here’s why.

Commercial websites sometimes provide SSL encrypted links to independent authentication websites (Verisign and GoDaddy, among others) to prove their authenticity. The problem with the government using this method is that it would add paperwork and costs to implement SSL badges, or require changes in existing online social networks’ profile options. Also, I don’t think there are products yet which work with OSNs and the authenticators to verify anyone on social networks. Perhaps more importantly, the government would then be depending on a commercial company to prove its authenticity. Basically it’s a non-starter if you want to actually achieve a Government 2.0 presence online in the near future, for reasons ranging from practicality to policy to politics to costs.

But wait, there may be a much easier and better way. .Gov and .Mil web sites already are monitored and checked for authenticity unlike .com and .org sites.   So you don’t need an independent cyber authenticator such as Verisign because any .Gov or .Mil site can serve as that authenticator.

Reciprocal Link Authentication.

Why not have a simple policy that any online social network account or non-.Gov/.Mil online presence have a link to a .Gov/.Mil webpage which then links back to that same OSN account? So if someone wanted to verify a government Twitter account, they could simply click on the URL provided and easily find a link back to that same Twitter account on the .Gov/.Mil webpage they landed on. If the account is hijacked, then a notice of the problem could be put up on that webpage until the account identity is secured again. If this is done on all federal OSN accounts, the cybercommunity will quickly become accustomed to the authentication method, and if a hijacker removed the authentication link, visitors would know to dismiss the account. And if they see something which sounds a bit off, they can instantly verify it by following the link back to the OSN account. It would not mean much work, since online government representatives at non-.Gov/.Mil sites almost always have some .Gov/.Mil landscape under their control.
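The two-click check could also be automated. The sketch below is one way it might look, using only the Python standard library. It assumes the pages are plain HTML with ordinary links, and it takes a `fetch` function as a parameter so that the check itself does not depend on how pages are downloaded. The account names and URLs in the usage example are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_in(html):
    collector = LinkCollector()
    collector.feed(html)
    return collector.hrefs


def is_official_host(url):
    """True if the URL's host ends in .gov or .mil."""
    host = (urlparse(url).hostname or "").lower()
    return host.endswith(".gov") or host.endswith(".mil")


def normalize(url):
    """Compare URLs by host and path only, ignoring a trailing slash."""
    parsed = urlparse(url)
    return (parsed.hostname or "").lower(), parsed.path.rstrip("/")


def reciprocally_linked(account_url, fetch):
    """True if the account links to a .gov/.mil page that links back to it.

    `fetch` is any callable that maps a URL to that page's HTML.
    """
    target = normalize(account_url)
    for href in links_in(fetch(account_url)):
        if not is_official_host(href):
            continue
        if any(normalize(back) == target for back in links_in(fetch(href))):
            return True
    return False


# Hypothetical pages held in memory instead of being downloaded.
PAGES = {
    "https://twitter.com/ExampleAgency": '<a href="https://www.example.gov/social">Official site</a>',
    "https://www.example.gov/social": '<a href="https://twitter.com/ExampleAgency/">Twitter</a>',
    "https://twitter.com/FakeAgency": '<a href="https://www.example.gov/social">Official site</a>',
}


def fetch(url):
    return PAGES.get(url, "")


print(reciprocally_linked("https://twitter.com/ExampleAgency", fetch))
print(reciprocally_linked("https://twitter.com/FakeAgency", fetch))
```

The impostor account can copy the link to the official page, but it cannot make the official page link back, which is the whole strength of the method.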

Reciprocal Link Authentication seems easy and low cost, and it instantly provides a universal method to authenticate any online government representation without much effort. Sure, it’s not perfect from a cybersecurity point of view, but it goes a long way toward addressing several important concerns about government representation on non-government websites.

I noticed after writing this post that the underlying theme emerging from the fanciful thought droppings below is that it is best for the end user if data and applications are separate and interoperable.   The theme is starting to highlight for me the promise of semantic technology and open data standards.

I keep hearing: Will Facebook win? Will Google win? Will Microsoft ever get in the running? Will Twitter be bought, and by whom? I wanted to offer another option. Could the people win?

How would the people win?
Well, what is a social network anyway? It’s a series of connections between people, and it has rules for distributing information to people based on their connections. There are mutually agreed friends, followers, and non-connected voyeurs following what you do and when you do it, as well as sharing with you. The connecting and sharing rules of the social network you choose determine what others see and, if you are up on the privacy settings, how you are connected with them.

Right now our choice is which networks to be on, and we make that choice based on the connection rules, the type of content and interactions that can be had, and where the people we want to connect with already are. As Facebook or another network becomes more popular, it becomes more difficult not to choose it.

But we pay a price for choosing an online social network.
1. We have to accept the interface which is chosen for us. And while more customizations and widgets are coming out, the essential choice of interface is in the control of the provider, not us.
2. We can’t choose our ideal mix. For instance, what if we want a MySpace style interface but with our Facebook friends feed? There are some configuration options available, but trying to match what we want with what is out there can still be a challenge.
3. We get targeted advertising based on our personal information. Maybe we want it, maybe we don’t, but at any rate we are not in full control of our information which gets mined for these ads.
4. We can’t move our information to another network or cross link to people in other networks. This is changing some, but our information is still not in our control.
5. We can’t create our own rules for connection and viewing; we have to rely on a central authority to do this, even if they allow some flexibility. Very non-Web 2.0.

So how do we win?

What if instead of our data residing on a social network server, it resided on our own private space in the cloud?

And what if we could choose or even create the applications which would allow our data to be seen by others, with the rules which we decide on? We could use a Facebook style application to interact with our friends, but our friends wouldn’t have to be “on” Facebook. They would simply have their own ‘cloud space’, and they could send Twitter style updates back to us and not have to look at the vacation pics we just posted if they don’t want to. They could also choose to send some updates only to some people if they wanted, rather than having only the choice to tweet to all or tweet directly to one. Basically, the social network core of connections and activity of you and your friends could be managed by any number of applications and rule configurations more tailored to each individual. The way you want to interact with your friends, and who your friends could be, would not be determined by the popularity of a social network but by you.
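To make the idea a little more concrete, here is a toy sketch of a personal ‘cloud space’ in Python. Everything in it is hypothetical, including the class names and the rule model. The owner decides which groups may see each item, the viewer decides which kinds of items they want, and any application could sit on top of the same store.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    kind: str            # e.g. "status" or "photo"
    text: str
    audience: set[str]   # names of the owner's groups allowed to see the item


@dataclass
class CloudSpace:
    """One person's data store, with groups and sharing rules set by the owner."""
    owner: str
    groups: dict[str, set[str]] = field(default_factory=dict)
    items: list[Item] = field(default_factory=list)

    def share(self, kind, text, audience):
        self.items.append(Item(kind, text, set(audience)))

    def visible_to(self, viewer, kinds=None):
        """Items the owner lets `viewer` see, limited to the kinds the viewer wants."""
        viewer_groups = {g for g, members in self.groups.items() if viewer in members}
        return [
            item
            for item in self.items
            if item.audience & viewer_groups
            and (kinds is None or item.kind in kinds)
        ]


alice = CloudSpace("alice", groups={"family": {"bob"}, "friends": {"bob", "carol"}})
alice.share("status", "Back from vacation", {"friends"})
alice.share("photo", "Vacation pictures", {"family"})

# Carol is only in "friends"; Bob is in both groups but skips photos.
print([item.text for item in alice.visible_to("carol")])
print([item.text for item in alice.visible_to("bob", kinds={"status"})])
```

A Facebook style application and a Twitter style application would both just be different views over calls like `visible_to`.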

Would this kill Facebook or Google? Facebook would probably be the most popular application people choose for interacting with their friends, and it could still get its ad revenue. Google could provide the cloud space to host our data securely, for free with ads or for a small cost, as well as provide an interface application if you want it.

Twitter provides the first step in separating social data from the social application, and it is good evidence of why this approach would be so popular. I don’t mean the asynchronous relationships or the 140 character limit, but the fact that anyone can build a Twitter application to interact with the “cloud space” of Twitter feeds. TweetDeck, TweetGrid and many other Twitter applications let people choose, to some extent, how to interact with their social connections and what their interface looks and feels like. What I am suggesting is widening this approach to include all of the personal information which you might want to share, and putting you back in control of your own information.

So you could have one interface for your immediate family, another window for friends and another for interesting people you follow, or any combination you choose. Application vendors could make money through ads, but you would choose vendors based on their privacy policies governing what those ads could find out about you. Or you could choose to keep everything very private and pay for a service and a place to keep your data. This is similar to what people refer to as interoperability between networks, but with the twist of separating our personal data from the network itself. So it’s more of an interoperable data model for social networking than an interoperable social network model.

Would this work? Is part of what makes a social network the common rules and ways of connecting which we have all agreed upon? If some people could stop sharing a lot of information except with their BFs, would the fabric of the social network be weakened and this whole idea result in a less networked world? I don’t think it would, because the culture has started to discover the benefits of sharing, but it’s definitely an open question.

So how do we get there? Hmmm. Not sure. Google’s free App Engine could potentially power something like this. Something like the user rebellion which occurred when Facebook tried to change its privacy policy a couple of months ago might be the start of an online privacy movement. Right now, though, people seem to be having too much fun to worry about being in charge of their own information. Will this change? I guess it depends on what the social networks decide to do with all of our information that they have.

I decided to take my Wordle data set out for another spin and make Google Maps from each category.

Here are the maps. Hope you enjoy them!

There are not 100 questions in each map because some people did not provide valid US locations and a few questions were taken out for being off topic, as described before. The maps end up having 829 questions in 9 categories. Thanks to MapAList for the map tools and Ken Ward’s HTML guide for the JavaScript template.