Posted by: Ken Fischer on: August 4, 2009
Opportunity: Spending of government money should have a purpose, and that purpose should be to benefit someone, whether directly or indirectly. The benefit might be for an employee to work better, and that employee might be working to benefit a group of citizens. The administration wishes to create a more transparent, effective and innovative government as well as to reduce the federal deficit. In order to do this, the administration must identify opportunities for innovation which can increase efficiency as well as decrease spending, and make the case to the American people that it is making more effective use of taxpayer funds. I want to make the case here that linking spending data to the benefits of that spending in ways which are detailed, clear and relevant to large numbers of citizens is the best way to find innovations to create a more effective government as well as to make transparency have meaning and value for the average citizen.
Challenge:
Approach: Identify, Find and Link Disparate Data Sources which can clarify the benefits of Government Expenditures
Datasets must be found which can connect government spending to both outcomes and benefits to people. For instance, compete.com provides data on how many visitors a website receives. Connecting the cost of a government website to the number of visitors it receives per year can give a cost per citizen served. Therefore getting the free data provided by Compete.com and linking it to the cost of a government website will provide more transparency and a clear cost of the benefit provided. This can then be compared to other ways of providing that same benefit of information delivery.
Another example is connecting the expense of providing office furniture to the known number of employees in an agency. This makes clear the cost of providing office support per employee, which could be compared to private sector data.
Connecting government expenditures to their benefits and making clear the cost per beneficiary in relevant ways can become a starting point for encouraging innovation to make a more effective government as well as to give the idea of government transparency meaning and value to the largest number of people.
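As a rough illustration of the linking step, the sketch below joins a table of website costs to a table of visitor counts by domain and derives a cost per visitor. The domain names and all figures are made up for illustration; they are not real spending data or Compete.com data.

```python
# Hypothetical figures for illustration only; not real spending or traffic data.
annual_site_cost_usd = {
    "example-agency.gov": 250_000.0,
    "example-bureau.gov": 90_000.0,
}
annual_visitors = {
    "example-agency.gov": 1_000_000,
    "example-bureau.gov": 30_000,
}


def cost_per_visitor(costs, visitors):
    """Link the two datasets on domain and return dollars spent per visitor."""
    linked = {}
    for domain, cost in costs.items():
        count = visitors.get(domain)
        if not count:  # skip sites with missing or zero traffic
            continue
        linked[domain] = cost / count
    return linked


result = cost_per_visitor(annual_site_cost_usd, annual_visitors)
for domain, value in sorted(result.items()):
    print(f"{domain}: ${value:.2f} per visitor")
```

The same join works for any pair of datasets that share an identifier, which is why agreeing on the identifier matters more than the particular numbers.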
Case for Using the Resource Description Framework Or Linked Data Model:
While linking data can be done in many different ways, I do want to give a plug for the linked data model in this instance, because in the long term, I believe it is the best way to connect government spending with the benefits of that spending.
Of course, connecting spending to benefits is not always as simple as the examples I gave, nor is the data easy to find and easy to connect. In fact, you may need to link multiple datasets in a chain to get the benefit information in a way which is relevant and broadly understandable. The Resource Description Framework (RDF), or Linked Data model, gives us a way to start collecting this kind of data in a distributed fashion without strict central control, and it does not even require the data to be on the same server or system in order to be linkable. This makes RDF or Linked Data an ideal candidate to complete the long term vision of linking complex federal spending data with its outcomes and benefits in a way which can have meaning for the average American citizen.
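To make the idea of a chain concrete, here is a minimal sketch in plain Python that mimics RDF-style statements as (subject, predicate, object) tuples. It does not use a real RDF library, and every URI, predicate name and figure in it is a hypothetical placeholder. The point is only that two independently maintained datasets become linkable because they use the same URI for the same thing.

```python
# All URIs, predicate names and values are hypothetical placeholders.
# Each dataset is a list of (subject, predicate, object) statements and
# could be published on a different server; only shared URIs connect them.
spending = [
    ("http://example.gov/contract/42", "costUsd", 120_000),
    ("http://example.gov/contract/42", "paysFor", "http://example.gov/program/P"),
]
outcomes = [
    ("http://example.gov/program/P", "peopleServed", 4_000),
]


def objects(triples, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def cost_per_person(contract_uri, datasets):
    """Follow the chain contract -> program -> people served."""
    merged = [t for dataset in datasets for t in dataset]
    cost = objects(merged, contract_uri, "costUsd")[0]
    program = objects(merged, contract_uri, "paysFor")[0]
    served = objects(merged, program, "peopleServed")[0]
    return cost / served


per_person = cost_per_person("http://example.gov/contract/42", [spending, outcomes])
print(f"${per_person:.2f} per person served")
```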
Posted by: Ken Fischer on: June 2, 2009
Linked Data is now officially on its way to the commercial world. Google is creating the market incentive to drive it, one element at a time.
Google’s recent unveiling of Google snippets and search options means it’s time to change how you display your products online. Your products and any reviews of your products by customers will soon be found if you have displayed them in one of several formats. The technical names for these are RDFa and several microformats. Also, using Google, people can more easily look for reviews. Soon they will also be able to compare different products and extract specific information from your website using Google Squared.
What does this mean? It is the beginning of a change in how the underlying code of websites is written. It will mean that facts, social remarks or other categories of information which the consumer deems important will be more than keywords. Keywords are not dead, but they are now going to be one element of the search equation rather than the only one. Google will likely determine whether people find well formatted information more useful, and if they do, it is likely those sites will start to appear at the top of searches.
Of course this will not happen overnight, but this announcement and the fact that Yahoo is already doing something similar with SearchMonkey mean the race to a Linked Data web of meaning has now started. It will proceed in a step by step fashion, probably based on what meaning consumers are searching for.
But yes, it’s time to change how your products are displayed online and time to start reconsidering social content in your website. That is certain.
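For a sense of what such markup looks like, the sketch below renders a product review as an HTML fragment with RDFa attributes. The namespace and property names (v:Review, v:itemreviewed, v:reviewer, v:rating, v:summary) are my recollection of the data-vocabulary.org terms used in Google's examples at the time and should be treated as assumptions to check against the current documentation. The function name and the sample review are invented for illustration.

```python
from html import escape

# Assumed vocabulary namespace; verify against the search engine's documentation.
VOCAB = "http://rdf.data-vocabulary.org/#"


def review_as_rdfa(item, reviewer, rating, summary):
    """Render one review as an HTML fragment carrying RDFa attributes."""
    return (
        f'<div xmlns:v="{VOCAB}" typeof="v:Review">\n'
        f'  <span property="v:itemreviewed">{escape(item)}</span>\n'
        f'  Reviewed by <span property="v:reviewer">{escape(reviewer)}</span>.\n'
        f'  <span property="v:summary">{escape(summary)}</span>\n'
        f'  Rating: <span property="v:rating">{rating}</span>\n'
        f'</div>'
    )


# Invented sample data.
snippet = review_as_rdfa("Widget & Co Blender", "A. Customer", 4.5, "Works well")
print(snippet)
```

The visible text stays the same for human readers; the extra attributes are what let a search engine tell the product name from the reviewer and the rating.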
Posted by: Ken Fischer on: May 26, 2009
Tim Berners-Lee’s concept of linked data is clearly a way to make data more usable, whether this is public data or data within a large enterprise. Linked data promises a future which makes related data more interoperable and discoverable, and opens the door for innovation.
But how do we take large existing data stores and apply linked data principles to achieve these benefits? We currently have massive existing data stores with complex security regimes which many legacy applications depend upon. To make them available as Linked Data is a huge challenge, especially if we were to recreate these data stores in XML syntax using RDF/RDFa or even simpler XML schemas. This is coupled with the fact that many of the benefits of the reconstituted data have not yet been invented, so an ROI argument cannot clearly be made. Of course, they haven’t been invented yet because, while many can agree the data would be more usable, those uses must be discovered by fiddling with the data in linked form and seeing what emerges. Since the linked form doesn’t yet exist, we have the classic chicken and egg problem.
Perhaps there is a step we can take toward linked data without making large changes to the existing data stores in government and industry. Let’s review the principles of Linked Data first (as paraphrased from Wikipedia to add clarity):
The striking thing about these principles is that they don’t mention XML, RDFa, etc., but focus instead on linking data to definitions. So it would seem a hybrid solution between the linked data concept and existing databases is possible. We could add URIs as fields in existing databases for important elements and define a central location where we will track information about each element. For instance, in the US government there are lots of federal buildings used by multiple agencies, so I would assume many agencies have databases which refer to federal buildings. Why not establish a central location to define those buildings and assign each a URI? (A URI, by the way, is essentially a universal identifier for a real world object. In effect it is a web page for each building, but the page would more likely contain data links than nice pictures. Oh, and some people use URNs, or Uniform Resource Names, which are one kind of URI, in an effort to make them more human readable, which is nice too.)
So each federal building would have a URI/URN, and we could of course put more information about each building in a centrally defined schema, but that will start to be real work and have instant security issues. So why not initially just have URIs contain reciprocal links to databases which also contain that identifier? The links would have brief, non-security-breaking descriptions of what type of data is stored in the linked database. This would remove the need to re-securitize a lot of information to make it available across departments and agencies. And here is the other key to success for this type of solution: don’t require the back links to the databases to expose data unless they already do so. If we start requiring data to be exposed in this step, it opens up the security Pandora’s box. We need to avoid imposing a new security regime for centralized data, because it is a stumbling block which would create delays and costs. And if people do not clearly see the benefits of this step, then it would simply die in committee in most cases.
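A minimal sketch of what such a stripped-down registry might hold is shown below. The class names, the URI and the database names are all hypothetical. Note that the registry stores only the identifier, a label and short descriptions of which databases mention it; no underlying records are copied or exposed.

```python
from dataclasses import dataclass, field


@dataclass
class BackLink:
    database: str     # name of an existing database that uses the URI
    description: str  # brief, non-sensitive note on what kind of data it holds


@dataclass
class UriRegistry:
    entries: dict = field(default_factory=dict)

    def register(self, uri, label):
        """Assign a URI to a real world object such as a federal building."""
        self.entries.setdefault(uri, {"label": label, "links": []})

    def add_back_link(self, uri, database, description):
        """Record that a database refers to this URI, without exposing its data."""
        if uri not in self.entries:
            raise KeyError(f"unregistered URI: {uri}")
        self.entries[uri]["links"].append(BackLink(database, description))

    def databases_for(self, uri):
        """List the databases known to hold data about this URI."""
        return [link.database for link in self.entries[uri]["links"]]


# Hypothetical building and databases.
registry = UriRegistry()
building = "http://example.gov/id/building/1001"
registry.register(building, "Example Federal Building")
registry.add_back_link(building, "agency-a-leases", "lease terms by floor")
registry.add_back_link(building, "agency-b-maintenance", "maintenance work orders")
print(registry.databases_for(building))
```

Anyone who wants the actual lease or maintenance data still has to go through the owning agency's existing access controls; the registry only tells them where to ask.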
So that is fine you say. We have URIs for important data elements and for databases which contain those elements but it is not exposing data, so where is the benefit? I think this stripped down version of linked data would have 4 definite benefits:
So now the readers of this blog are in at least 2 camps.
I am hoping there is a third camp which sees this as a doable step in large enterprises such as the US government. And that it would be the first step toward data which is more linked and therefore more usable for both public and internal uses, and eventually interoperable.
Let me know which camp you are in!