I've had an interesting couple of days dealing with lots of people and lots of data. epiGenesys is in the middle of a project to replace workload modelling software used by two faculties in the University. The original software was developed a few years ago by students involved in Software Hut, and shortly after was enhanced by students from Genesys Solutions. As always, things have moved on, and there is now a need to adapt the software to handle more data, and to enable it to be efficiently loaded with updated data on a regular basis. The outcome should be more useful outputs, and probably some automated analysis features, whilst reducing the administration work involved.
We are fortunate that last year we developed software to provide similar outputs, although using a very different approach and much less data, which is being trialled in a third faculty. From this we already had a good understanding of the key domains of data that would be of interest, and knew that all of it would already exist. Unfortunately we also knew that the data would be scattered between local sources in academic departments and across a number of central databases, using a variety of formats, and that nobody would have a detailed knowledge of all of the domains.
Given the quantity of data involved, and our desire to reduce time spent entering data manually, it was a straightforward decision for the project team that the new software had to import reliably from a variety of sources with minimal user intervention. Our challenge is therefore to find people with detailed knowledge of the data, seek their help in finding the best source of the data, and then ensure it can be imported easily. The technical side of this has been relatively simple; the implementation of a reasonably generic import feature, capable of supporting various templates and running some custom pre-processing, has provided the flexibility needed. It probably won't surprise you to learn that the difficulty has been in the non-technical side of the work.
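For anyone curious what a template-driven import with pre-processing might look like, here is a minimal sketch in Python. To be clear, this is not our actual code: the `ImportTemplate` class, the `import_rows` function, the column names and the sample data are all invented for illustration, and it assumes the source arrives as CSV with a header row. The idea is simply that each template maps the column headings of one source onto our own field names, and carries a list of clean-up steps to run on every row.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

Row = Dict[str, str]


@dataclass
class ImportTemplate:
    """Hypothetical template: describes how one data source maps onto our fields."""
    name: str
    column_map: Dict[str, str]  # source column heading -> internal field name
    preprocessors: List[Callable[[Row], Row]] = field(default_factory=list)


def strip_whitespace(row: Row) -> Row:
    """Example pre-processing step: trim stray spaces from every value."""
    return {key: value.strip() for key, value in row.items()}


def import_rows(source: Iterable[str], template: ImportTemplate) -> List[Row]:
    """Read CSV lines, rename columns per the template, then run each clean-up step."""
    reader = csv.DictReader(source)
    headers = reader.fieldnames or []
    missing = [col for col in template.column_map if col not in headers]
    if missing:
        raise ValueError(f"{template.name}: source is missing columns {missing}")

    records = []
    for raw in reader:
        # Short rows yield None for absent cells, so fall back to an empty string.
        row = {target: (raw[src] or "") for src, target in template.column_map.items()}
        for step in template.preprocessors:
            row = step(row)
        records.append(row)
    return records


# Made-up sample data standing in for an export from a departmental spreadsheet.
sample = io.StringIO("Staff ID,Module Code,Hours\n 1001 ,COM101, 40\n1002,COM202,25 \n")
template = ImportTemplate(
    name="teaching_hours",
    column_map={"Staff ID": "staff_id", "Module Code": "module", "Hours": "hours"},
    preprocessors=[strip_whitespace],
)
records = import_rows(sample, template)
```

Supporting a new source then becomes a matter of writing another template rather than another importer, which is roughly where the flexibility comes from.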
We're currently using a three-pronged approach that involves working with people in academic departments who use most of the data on a regular basis, with people in central services who in many cases are the data owners, and with people in CiCS who are familiar with the content of central databases. This is proving time-consuming in some cases, and I've now become immune to the disappointment of hearing that someone is unable to provide the data we need. Fortunately, everyone has been more than happy to suggest a colleague who might be able to help, or another data source that might contain what we need, and gradually progress is being made.
It's a little concerning that we're having to rely on our network of contacts so much for this project, although it was flagged up as a risk in the early stages. On the other hand, it's always nice to explore the organisation and meet new people. We're taking the opportunity to consolidate what we learn into the on-screen help within the software, which will share the knowledge with people who can really benefit from it. I'm also wondering if we could draw a 'map of people and data' for future reference, although I expect it might look like a horribly complicated version of a tube map!