Jul 12

Nobody has given up on Linked Data

In light of the news that Talis Systems is suspending its investment in a generic semantic web platform and its semantic web consulting business, I wanted to explore why I think you shouldn’t draw gloomy conclusions for Linked Data and the web of data.

But first a short aside to illustrate what I think is one of the core differences between a graph based approach and a relational database approach to building applications.

When you try and think about data as anything other than describing the way the world looks and works, you have to make compromises in your view of the way the world looks and works. The biggest compromise trap you will likely fall into is making the assumption that everything that you don’t know about doesn’t exist so therefore cannot be true.

Faced with such a closed outlook on life you are going to find it really difficult to react to challenges that force you to accept that your closed world is a bit bigger than you thought it was. One of those difficulties is deciding whether to live with the status quo, or spend time and effort re-writing software to make use of this newly extended view of the world.

So don’t do that.

The core concepts of Linked Data and the web of data allow you to build a view of your world, described by your data, which your applications can then feed off. Your applications become either parasitic or symbiotic depending on whether they are purely consuming data or consuming the data to generate some new insight which is fed back into the system as new data.

Because one of the assumptions of Linked Data is that there is other stuff that you don’t know about, you have to build your applications to also assume that there is stuff that they don’t know about. Your application can become more aware of the types of data it is dealing with, and recognise data patterns that it knows how to display or work with. This is a more organic way of designing applications that to my mind feels more natural.

This is just one of the ways in which organisations like Talis have changed the way they build software. So just because the economics didn’t work out for Talis in building a generic semantic web solution, it doesn’t mean that the lessons we have learned over the last few years don’t apply to your specific problem area.

Jul 12

Too early, too slowly

You’ll have heard the news. Talis Systems is being wound down. The considerable investment that Talis have made in fostering the vision of the web of data has resulted in notable successes with organisations such as the BBC, the Ordnance Survey and the British Library. I’m proud to have been a part of the world-class consulting team that helped to get these organisations to a point where they could see the benefits of the web of data vision and join it. However, the commercial realities of a small organisation working in a market that is growing at too slow a rate meant that we could not sustain the required level of investment.

For the last year and a half I have been talking to organisations about how they use data and how they want to take steps to make that data more openly available. This is worthy stuff, but for most organisations this is also experimental stuff. Some were more willing to go along for the ride than others, but even those organisations baulked at changing everything all at once.

And that’s not surprising.

I’ve written before about how graph thinking and open world assumptions make you approach a project in different ways. Some organisations are not ready to do that.  They feel that the change to a more open approach challenges their existing revenue streams.  Yet they don’t see that the people who currently pay for data in its existing form will continue to do so for some time because they too are resistant to change. However, there will come a time when those customers will also feel they need to change the way they do things. If you want to protect your current revenue stream you have got to explore some new opportunities so that you are ready when your existing customers move to a new technology.

But still, as employees working in an uncertain climate, not everyone is willing to risk their own standing within an organisation. Especially when you look outside that organisation and you see unemployment, recruitment freezes and belt tightening.

Which brings me to the factor in winding Talis Systems down which couldn’t be foreseen, and that is one of pure timing. A general attitude of wariness has emerged as a result of the current economic climate; the Queen’s jubilee and a longer royal wedding holiday pushed spending decisions into the following months; August is typically an apathetic month when everyone is on holiday; and there is a decision freeze for the Olympic games. All of these served to make people put off purchasing decisions, especially where those decisions related to experimental projects. We can’t afford for everyone to wait until September to decide to do something.

This indecision is one of the indicators of a slow moving market that is not ready for commercial exploitation by a small business.

So we were too early. We had a vision for easy data flow into and out of organisations, where everyone can find what they need in the form that they need it through the use of linked data and APIs, and where those data streams could be monetized and data layers could add value to your datasets. But for many, the vision simply seems to be more of a dream. The difference between a dream and a vision is that a dream is more fantastical, while a vision is a practical goal. I think we had the vision, but others saw it as a dream, something unobtainable, something floating around within the cloud computing, big data, semantic web and linked data marketing spiel.

Other organisations besides Talis, sharing similar visions, have all had to change the way they present themselves as they realise that the market is simply not ready for something so new.

I have had the privilege over the last few years to work with many very smart people, both within Talis and within the organisations who engaged us as consultants and providers of a service around their data. I wish we could have gone on for longer, but sensible business decisions are made and have to be stuck to.

I look forward to sharing with you what I’m going to do next.

As a footnote: I should point out that Talis Education Ltd are still going strong with the Talis Aspire reading list management tool which is used by around 25% of UK universities and by similar organisations internationally. This is where Talis Group investment will be focused.

May 12

Why you should learn [code]

This was going to be a rather long biographical post about how I learnt to code. But I deleted all the boring stuff and I leave you with this reusable nugget…

Simply replace [code] with another subject.

Learning [code] was my way of solving a problem. If your problem can be solved by [code], then learn it. If not, don’t.  Don’t learn [code] just because you think you should learn [code].  Don’t let anyone make you learn [code] if it isn’t going to help you solve a problem.

Now go and reform the education system so that we teach our children how to choose the right tool to solve a problem, and not how to pass tests.

May 12

Working the internal data universe

When I first started writing code, I wrote in a typically haphazard way. (Actually I still do this, but now have half an eye on well-formed code style.)

But what I definitely used to do was think about how to store persistent data that the application needed in a relational database backend. I wrote a CMS from scratch to host this website on (long since decommissioned), and a system for working out how much parents of nursery children owed based on the time their children had been in the nursery compared to contracted time (there were penalties for late collection). Both of these web apps required persistent data in the backend, and at the time my first recourse to a data store was a relational database.

Working with a relational database didn’t sit well with my style of fast iteration to get a problem solved, then tidy up afterwards, to make sure that the code was workable and readable.

Having to constantly add columns to a table as new pieces of information needed to be stored, either for optimisation or to support some new feature, was a pain. It broke code on more than one occasion, and created hard-to-pin-down bugs that were usually the result of using a select statement that called for named columns that were now either renamed or moved or gone.

I wasn’t agreeing my schema up front, because I didn’t know exactly what data I would need to solve the problems. I admit I could have used some coding strategies that would have abstracted the database layer from the application layer, but even though I tried to implement MVC, it seemed to just create more code and dependencies.

Contrast that to how I think about solving a code problem today.

Graph thinking; graph databases; the graph; are all things I didn’t know about when I started writing code. Now, when I am thinking of solving a problem I think about exactly two things.

  • What data does my application use?
  • What do I need to show to the user?

The answer to the first question is not a database schema, nor is it something that will break my code if changed. It is a description of the data my app will interact with, but described as a graph so that I can just add more description as appropriate.

The answer to the second question is that each bit of code knows about a small part of the graph that it needs in order to function correctly. Each function will know exactly which bits of the graph it needs, and how to get them. If the shape of the graph changes it will either flag an error if something critical is missing, or simply ignore the changes and continue working as before.
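As a rough sketch of that idea, and not the code I actually wrote, here is a tiny in-memory graph held as a set of (subject, predicate, object) triples. The identifiers (child:1, contractedHours, attendedHours) and the helper names are made up for illustration, loosely borrowing from the nursery example above. The function reads only the two facts it needs, complains if one of them is missing, and ignores everything else in the graph.

```python
# A tiny in-memory graph: a set of (subject, predicate, object) triples.
# Every identifier here is hypothetical and exists only for illustration.
GRAPH = {
    ("child:1", "name", "Alice"),
    ("child:1", "contractedHours", 20),
    ("child:1", "attendedHours", 23),
    ("child:1", "favouriteToy", "train"),  # extra description, unused below
}


def value(graph, subject, predicate):
    """Return one object for (subject, predicate), or None if absent."""
    for s, p, o in graph:
        if s == subject and p == predicate:
            return o
    return None


def extra_hours(graph, child):
    """Hours attended beyond the contracted time.

    Reads only the two predicates it needs; any other description
    attached to the child is simply ignored.
    """
    contracted = value(graph, child, "contractedHours")
    attended = value(graph, child, "attendedHours")
    if contracted is None or attended is None:
        # Something critical is missing from the graph, so flag it.
        raise KeyError(f"{child} is missing contractedHours or attendedHours")
    return max(0, attended - contracted)


print(extra_hours(GRAPH, "child:1"))
```

Adding more triples about child:1 leaves extra_hours untouched, whereas removing attendedHours makes it fail loudly rather than quietly return a wrong answer.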

I said earlier that I didn’t agree my schema up front because I didn’t know what data I would need to solve my problems. I realise now that if I had simply described the things my application would be interested in, I would have had more than enough data to get going with. I would also have been able to add new data back into the graph thus persisting the solutions to the problems I was solving.

But I couldn’t do that with a relational database because of the key difference between using relational databases and graph databases: I am no longer forced into using the assumption that the edge of my relational database is the edge of the world, and that anything outside of my database does not exist.

There are challenges to this way of working too, but I find it a far more natural way of working. I even find that new opportunities for data display or analysis become apparent as the data morphs into shape.

In short: working with graph data is more like working with the data universe that is inside my head. And that view of the world is exactly that, a view of the real world.

Mar 12

A letter from the Middle Ages

This post originally appeared on the Talis Consulting Blog

Well actually, not just one letter, but over a thousand letters from the middle ages.

Last weekend, the National Archives held a Hackathon in the reading room at Kew. Around 40 developers and interested people took data from the National Archives and played with it.  There were new mobile interfaces for the NRA discovery API; collections of tweets mined for the data and PDFs they contained; stats on historical participation in the Olympics pulled from the archives and shown on interactive maps. In all it was a fun weekend with lots of smart people in the room and very quiet but rapid typing on keyboards to get something finished by the 4pm Sunday deadline.

Prizes were:

  • 1st – Jonathan Tweed and Kai En Ong (ably assisted by Michael Smethurst, Faith Mowbray and Paul Rissen). A hack that pulls out data surrounding people & places in documents tweeted by @ukwarcabinet (and which – for a hack – is beautifully presented!).
  • 2nd – Jamie Mahoney – Debtors & creditors dataset hack maps the most popular lenders & shows who’s borrowing from where – Show me the money.
  • 2nd – Tim Hodson – A hack showing who wrote to whom in the middle ages.
  • 3rd – Crystal and Steven Hirschorn – A hack showing participation in the Olympics on an interactive map.

You can read more about these entries on the National Archives blog.

I hope you’ll forgive my showing off of my joint second prize winning contribution to the pizza and jelly baby fuelled hack fest.

I took a suggestion from Paul Rissen as a personal challenge, and started pulling the data that I wanted into a new CSV file.  I then converted that CSV file to a rudimentary RDF based model of the letters and people that the data described.  I now had a graph dataset which captured – in the way only a graph can – the network of relationships between people who are corresponding. It was then a case of finding a suitable JavaScript library to render my graph as a visual and to allow people to find out about who wrote to whom without cluttering up the graph diagram.
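To give a flavour of the CSV-to-graph step, here is a minimal sketch. It is not the hack itself: the column names (letter_id, sender, recipient), the sample rows and the function names are all assumptions made up for illustration, and it produces plain triples instead of real RDF.

```python
import csv
import io

# Made-up sample rows. The real National Archives data and its column
# names are not shown in this post, so these are assumptions.
SAMPLE_CSV = """letter_id,sender,recipient
L1,Person A,Person B
L2,Person B,Person A
L3,Person A,Person C
"""


def letters_to_triples(csv_text):
    """Turn each CSV row into two (subject, predicate, object) triples."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        letter = "letter:" + row["letter_id"]
        triples.append((letter, "sender", row["sender"]))
        triples.append((letter, "recipient", row["recipient"]))
    return triples


def correspondents(triples):
    """Collapse the letters into a set of (sender, recipient) edges."""
    senders = {s: o for s, p, o in triples if p == "sender"}
    recipients = {s: o for s, p, o in triples if p == "recipient"}
    return {(senders[l], recipients[l]) for l in senders if l in recipients}


edges = correspondents(letters_to_triples(SAMPLE_CSV))
for sender, recipient in sorted(edges):
    print(sender, "wrote to", recipient)
```

The set of edges is the sort of thing a graph drawing library needs: people as nodes, with a line from each writer to each recipient.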

Mar 12

Riding London

I find myself increasingly frustrated by the idea of having to use public transport especially when I have heavy bags to carry and know that I could do it much more easily by bike. So last night I spent extra effort to make sure that I could ride my bike across London today. My folder is a Birdy Blue, and as such has mudguards that are integral to the bike’s stability when folded. They are also vulnerable and so have got somewhat damaged over the last two years. So much damage that the front is held together with black gaffa tape and the rear had split in half.

My new set of mudguards arrived last week, and so I would normally have spent my Saturday at the Wolverhampton bike shed fixing other people’s bikes, and then in our yard, fixing up my own. However this week I was bound for London to attend a hack day, so Saturday bike tinkering was out of the picture.

For a while I was evaluating the options for travel from Marylebone to Kew, and thought that I could probably do it via the underground fairly easily. But then I thought of the bag of stuff I’d be carrying and how I would have to lug it around the underground stations. The wheeled bag is a drag (literally) and doesn’t fit onto the front rack of a boris bike, and a backpack is out of the question because of the weight.

I kept looking at the Birdy and realised that I wasn’t going to be happy on this journey without it. Therefore I set to and after removing two stubbornly steel-to-aluminium bonded screws with the aid of a drill, I am now enjoying the prospect of cycling to Richmond on a gloriously sunny London day.

Cycling is my favourite transport.

Feb 12

Critical Mass, February 2012

If you are into cycling in any way, you might like the opportunity for a friendly cycle round Birmingham.  Birmingham has a bit of a car centric attitude toward town planning, so it is great to be able to cycle around the centre en masse and be a very visible part of the evening traffic.  Oh and we drop in at a pub on the way home. What’s not to like?

Here is a short video of last week’s Critical Mass.

Critical Mass, Birmingham, February 2012 from Tim Hodson on Vimeo.

A Critical Mass ride around central Birmingham. It was cold, getting close to if not below zero, but still a good ride.

Meet at Pigeon Poo park (by St Philips, Temple Row) for an 18:30 start.

Jan 12

Ubuntu, Copying Partitions and UUIDs

After some rather unsettling moments when everything in RAM kept running, but the root filesystem quietly disappeared, I decided to clone the root partition onto a second drive and boot from that. On investigation it looked like the first hard drive – /dev/sda –  had been hitting a max temperature of over 130°C.

The thing that was really puzzling me was how to tell Grub 2 that it should use /dev/sdc6 as the new root partition. I followed several sets of instructions, but no matter what I did, it always chose /dev/sda6 as the root partition.

I then tried using the excellent boot-repair disk to see if that could do what I wanted.  I ticked the option I wanted which was to use /dev/sdc6 as the default boot partition.  I applied the changes and still it booted from the wrong partition.  boot-repair sends a clear and well thought-out report to paste.ubuntu.com, and while looking through this I happened to note that the cloned partition had the same UUID (Universally Unique IDentifier) as the original partition.

light bulb!

I figured that this must be causing the confusion, so a quick google pointed me to this post by Paul Goscicki, which confirmed that it was a likely cause.  So on running tune2fs -U random /dev/sdc6 , and then re-running boot-repair, I now have a system booting from the correct partition.
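A byte-for-byte copy of a partition carries the filesystem UUID along with it, which is how two partitions end up with the same one. If you want to check for this before you start fighting with Grub, something like the sketch below would do. It assumes you have saved the output of blkid as text; the sample output, device names and UUIDs here are made up, and the function name is my own.

```python
import re
from collections import defaultdict

# Made-up blkid-style output for illustration only. Real device names
# and UUIDs will differ on your machine.
SAMPLE_BLKID = '''/dev/sda6: UUID="1111aaaa-0000-4000-8000-000000000001" TYPE="ext4"
/dev/sdc6: UUID="1111aaaa-0000-4000-8000-000000000001" TYPE="ext4"
/dev/sdc1: UUID="2222bbbb-0000-4000-8000-000000000002" TYPE="swap"
'''

# Device name up to the colon, then the first standalone UUID="..." field.
# The \b stops this matching inside PARTUUID="...".
LINE = re.compile(r'^(?P<device>\S+):.*?\bUUID="(?P<uuid>[^"]+)"')


def duplicate_uuids(blkid_text):
    """Map each UUID seen on more than one device to the list of devices."""
    by_uuid = defaultdict(list)
    for line in blkid_text.splitlines():
        match = LINE.match(line)
        if match:
            by_uuid[match.group("uuid")].append(match.group("device"))
    return {uuid: devs for uuid, devs in by_uuid.items() if len(devs) > 1}


for uuid, devices in duplicate_uuids(SAMPLE_BLKID).items():
    print(uuid, "is shared by", ", ".join(devices))
```

Any UUID it reports is a candidate for the tune2fs -U random treatment on the cloned partition.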

UUIDs are obviously not always UUIDs!

Dec 11

Kasabi.com’s Developer Advocate

Since the beginning of this week, I am officially now working for Kasabi.com.  My role here is to find ways to make the adoption of Kasabi.com’s suite of datasets and easy to consume APIs even easier for developers who need a quick way to get at the data that powers their apps.

You probably already know that the core data hosting and ready-provisioned APIs of Kasabi datasets allow you to get a data backed application out of the door much more quickly than you would if you had to incorporate data hosting and access infrastructure within your project.

We want to concentrate on helping you build something fantastic that is compelling for your user groups.  We’re going to be doing this in several ways; by improving our documentation; by holding webinars to demonstrate key features; by running hack days to give you space to play and learn how to effectively work with the Kasabi APIs in order to get the data you want.

This is an exciting time.  We are building something that feels like its moment has come.

Dec 11

Cycle safety: A longer amber phase for traffic lights

While reading the DfT’s publication of a report on Infrastructure and Cycling Safety, it occurred to me that increasing the length of the pre-green red/amber phase at traffic lights, and allowing cyclists to cross the junction in this phase, could be a relatively cheap way to improve the visibility of cyclists and reduce the risk of collision with cars.

What do you think?

What are the pros and cons?