Skills development

A small group of us gathered in Bay 8 after lunch for a chat about skills development. Although small in number, we were a diverse group of software developers, researchers and students (from history, linguistics and computing/maths) and a librarian.

I proposed this session based on my observations that although a lot of effort seems to be going into building e-research infrastructure and tools, less effort is directed at ensuring that researchers (especially in the humanities) have the skills required to use them effectively to further their research. (I think this is partly due to the emphasis in current funding on infrastructure and not capability, but that does not explain the lack of embedding these skills into humanities curricula, preferably from Honours or even undergrad.)

Although it still seemed to be the case that digital humanities methods were not being widely incorporated into research training, there were lots of suggestions about how researchers could take matters into their own hands, from enrolling in formal training courses through to informal learning (e.g. open courses on the web) and learning-by-doing (the old ‘just have a play with it’ technique). Talking to colleagues from your discipline about what tools they use or know about was also agreed to be a good way to get started, so coming to a THATcamp should be on everyone’s list! There was also a discussion about the difference between learning about specific technologies vs developing a kind of ‘technical literacy’ that makes it possible to find out about, and move to, new tools and environments as technologies change.

There seemed to be a consensus that formal training would be of most benefit if it was targeted to the discipline or the needs of the researchers, and if the researchers had a strong sense of what it was they wanted to achieve, whether that was discipline-specific or perhaps something more generic (e.g. web authoring, or administering a collaborative workspace for a research team). This might make it difficult for trainers, as both tech skills and some domain knowledge would be needed to really meet the needs of researchers.

The conversation then turned to a slightly different question: do humanities researchers need to know all this stuff themselves, or is what is missing a multi-disciplinary team approach, in which researchers and technologists could work alongside each other, each bringing their own bodies of knowledge to the mix? We then talked about stereotyping (seeing ‘IT people’ as an undifferentiated blob of helpdesk staff / service providers rather than partners and professional equals), and the need for people in business analyst roles who can ‘bridge’ the gap between the researchers and technology experts. These roles are emerging (e.g. through e-research units at institutions or on a state basis, like Intersect in NSW or QCIF in Qld) but are still not all that common.

Thanks to everyone who participated in the session. Please add a comment if you were there and would like to add or clarify anything!

Towards an XML-free future for the digital humanities

I was pleasantly surprised that my talk on the AustESE (Australian scholarly editing) infrastructure went down so well with the audience. Less surprising perhaps was their negative reaction to my suggestion that there might be a life for the digital humanities outside of XML. XML has for a long time (since 1998 at least) defined what the digital humanities were all about, and so to cultivate the creation of an alternative that would overcome its fundamental limitations may indeed seem like heresy. Not only does practically every tool in DH depend on XML (TEI Guidelines, XSLT, XQuery, XPath, Oxygen, etc.) but also the skills of digital humanists are based on those same technologies. To suggest that XML may not be the way forward seems to imply two unpalatable consequences:

  1. all the texts we have encoded so far may have to be redone
  2. all the tools we have developed on top of XML would have to be thrown away

This seems crazy, as well as heretical. But let me explain why I think it is not.

In answer to the first objection, a fully-featured import facility would overcome any fears that encodings would have to be revised. The ability to ‘round-trip’ the data back to XML (albeit with some loss) would also quell fears of ‘lock-in’ to a possibly unstable alternative.

In answer to the second objection, the skills of digital humanists and all other technicians evolve continuously. We are at the mercy of the software industry, and learn whatever tools they offer us to do our work. What I am suggesting is that we instead devise our own tools to do our specialised job far better. As an added bonus such a suite of tools would be under our control and not subject to commercial whims.

The industrial future of XML

XML was created by the W3C with help from Microsoft, who saw it as a way of implementing web-services. Messages would be passed from the client to the server about actions that the service could or would perform. Since then, the ‘bloated, opaque and insane complexity’ of XML web services, as Tim Bray called them, has led many technologists to reject them in favour of a simpler noun-based methodology called REST. REST, in a nutshell, treats a service as if it were composed of static web-pages. ‘Get me this’, ‘here, have that’ or ‘delete this’ is what REST services are all about. Although originally designed to work with XML, REST services are increasingly being crafted with pure JSON, a much simpler encoding strategy that is gaining some powerful advocates. How much longer programmers will support XML remains unknown; it’s very deeply entrenched. But that they will eventually replace it with something simpler can hardly be in doubt. And when they do, those tools on which we rely will cease to be maintained and thus will soon die. With Microsoft rapidly moving towards a predominantly mobile desktop metaphor based on JSON, HTML5 and JavaScript, there seems no room for old-style ‘enterprisey’ XML in a future that is rushing towards us.
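
To make the ‘noun-based’ idea a little more concrete, here is a minimal sketch in Python of how the three phrases above might map onto HTTP-style verbs acting on named resources, with JSON as the message format. It is an illustration only, not how any particular service works: the `store` dictionary, the `handle` function and the path `/texts/1` are all hypothetical names invented for this example, and a real service would sit behind a web server rather than a plain function call.

```python
import json

# Hypothetical in-memory "service": every path names a resource (a noun).
store = {}

def handle(method, path, body=None):
    """Apply one HTTP-style verb to the resource named by `path`.

    Returns a (status_code, json_text) pair.
    """
    if method == "GET":            # 'get me this'
        if path not in store:
            return 404, json.dumps({"error": "not found"})
        return 200, json.dumps(store[path])
    if method == "PUT":            # 'here, have that'
        store[path] = json.loads(body)
        return 200, json.dumps(store[path])
    if method == "DELETE":         # 'delete this'
        if path not in store:
            return 404, json.dumps({"error": "not found"})
        del store[path]
        return 204, ""
    return 405, json.dumps({"error": "method not allowed"})

# Usage: store a small record, read it back, then remove it.
handle("PUT", "/texts/1", '{"title": "Example"}')
print(handle("GET", "/texts/1"))
print(handle("DELETE", "/texts/1"))
```

The point of the sketch is that the verbs stay the same for every resource; only the nouns (the paths) and the JSON payloads change.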


Getting started with the Trove API

In the morning session of the THATCamp Brisbane Hackathon, we explored the Trove API, first running through a tutorial and playing with some code to conduct searches using the official API; then looking at Tim Sherratt’s fantastic unofficial API and running some QueryPic searches; before getting somewhat distracted with discussion of the history of the Lamington.
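
For anyone who missed the session, a search of the kind we ran can be sketched in a few lines of Python. This is a rough sketch under stated assumptions rather than a copy of the tutorial code: the base URL and the `key`, `zone`, `q` and `encoding` parameter names are assumptions based on the Trove API as it was documented at the time and may since have changed, so check the current Trove documentation first. The function name `build_search_url` and the placeholder key are invented for this example, and the code only builds the URL; it does not contact the service.

```python
from urllib.parse import urlencode

# Assumption: base URL and parameter names as documented for the Trove API
# at the time of writing; verify against the current Trove documentation.
TROVE_SEARCH_URL = "http://api.trove.nla.gov.au/result"

def build_search_url(api_key, query, zone="newspaper", encoding="json"):
    """Return a search URL for `query` in the given Trove zone."""
    params = {"key": api_key, "zone": zone, "q": query, "encoding": encoding}
    # urlencode escapes spaces and punctuation in the query for us.
    return TROVE_SEARCH_URL + "?" + urlencode(params)

# Usage: "YOUR_API_KEY" is a placeholder for your own key.
print(build_search_url("YOUR_API_KEY", "lamington"))
```

Pasting the resulting URL into a browser, with a real key substituted, is a quick way to see what the response looks like before writing any more code.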

We also briefly talked about geocoding: we discussed the Geoscience Australia Gazetteer Release, and then we looked at some examples of maps that were created using the automatic geocoding features of Google Fusion Tables.

Tech project metaphors: Ships & Waterfalls

Some stream-of-consciousness notes from a session held today at THATCamp Brisbane.  The input below comes from all those attending the session – if you’d like your name listed here, let me know.

The topic was Tech Project Metaphors: Ships & Waterfalls

* Metaphors are almost tangible things in tech: since we deal with electronic impulses in silicon, we use terms like “thread” and “loop” to describe what we build.  Thus a metaphor is more than just an analogy; it is close to a reality.

* Good metaphors can help us deal with problems better.

* Metaphor “Build” software – this is an early metaphor.  Writing program text into an editor, compiling it and then delivering it as an artifact was called “building”.

* Type 1 and type 2 projects – type 1 has no users/customers and little or no community engagement; type 2 has customers/users and high engagement.  Engineers can think that type 1 projects are great because they can do what they want without customers reporting problems.  Business owners/customers think type 1 projects are disasters/white elephants because they cost money but deliver no value.

* Metaphor “Waterfall” process – now we have a specification – rows of binders full of documentation about how the software should be.  Then we build what is described there, as before.  This is problematic for customer figures who don’t see working software until the waterfall has gone down through a few levels.

* Metaphor “Beehive” – software engineers are in a nice environment (the beehive) where they are cultivated and nourished by the beekeeper who pays the $$$ to keep them.  They produce software and the keeper scrapes the good stuff off the top.  Google’s environment is like this, but of course if the stuff is not good, it dies quickly so there is motivation.  The Google project “Knol” tried to be a better Wikipedia, and it was a result of engineer-driven requirements or beehive development – but because it didn’t meet the requirements of the Googling public it died; it was a type 1 not a type 2.

* Metaphor “Ship” – a journey, a team travelling and working together towards a common destination.  Different roles – those on the bridge steer the ship, those in the engine room burn cash to drive it forward.  On the bridge the team members need to set the course, plan waypoints and ensure they arrive at the correct destination (that is, meet the customers’ requirements), while those in the engine room continue to drive the project forward by writing lines of code and putting implemented features behind them.  Need to iterate, and communicate between the two groups to be successful.  Clues and plans to steer the ship successfully are vital.

* From the ship we see procedural literacy can help those on the bridge to talk to the ones in the engine room.

* Pulling teeth needs to be done so that those who are not forthcoming with critical project requirement intelligence or information on risks (icebergs, reefs, unfavourable currents) do come up with that information.  Means getting the engineers to come up from below and take the long view to point out what may be obvious to them.

* Boundary rituals are useful in marking the passage.  Need to create ceremony around waypoints.

* Need to tokenise, gamify and make artifacts out of as much as possible in the project management so as to ensure that all those on the ship see the same things and understand the messages.

* The crew of the ship is an inter-disciplinary team.  Communication is a key skill for all of them, but especially for those on the bridge.

Key problems – how to promote, value and improve communication between the members of the crew?  How to bind them together so they feel part of the same “road trip” – not “us and them”?  How to recruit team members who have skills in communication, problem definition and navigation?

Mash / hack at THATCamp Brisbane

Interested in mashing up data to produce interactive maps, timelines, visualisations, web or mobile apps?  THATCamp Brisbane will include a hackathon stream running all day (10:30 – 4:45) in Lab 2 at The Edge.

Got an idea for a mashup / app but need help building it? Or are you a developer with some expertise to share, a DH app to show off, or looking for new ideas and people to collaborate with?

We’ll be running tutorials throughout the day, as well as providing hands-on help to get started with building your own mashups and apps.

Topics that may be covered (depending on demand) include:

  • Cleaning up data sets using Google Refine
  • Creating data visualisations with Gephi
  • Data visualisation and mashup using Google Fusion Tables
  • Using Yahoo Pipes
  • Getting started with the Trove API
  • Introduction to programming
  • Creating mobile web apps
  • … or suggest your own topics!

Drop in any time during the day to join in the fun, but don’t forget to bring a laptop so you can take part.

Pop Up Linked Open Data – Libraries, Archives Museums (LODLAM)

What is linked open data? Well, that’s something many of us in the arts, humanities and GLAM sector are trying to get to grips with.

This post links to Roger’s post on Digital ANZAC.  There’s a move in the linked open data in libraries, archives and museums (LODLAM) community around the globe to test out linked open data in diverse ways.  One of the areas people are looking at to test out this new method of providing scholarly (arts and humanities) and GLAM collection data is online content associated with the ANZACs and WWI (1914-1918), to align with the 100-year commemoration in 2014.  From what I can glean, there are murmurs in the UK, NZ and also in Oz on that front.  See also the Europeana 1914-1918. Your Family History of World War One project, which has (I think) helped to inspire this venture.

The Pop Up LODLAM that will be a part of the THATCamp Brisbane unconference at The Edge is an opportunity to learn from others, participate in discussions and share ideas about how linked open data might be applied practically.  There’s also the added incentive that there will be a LODLAM Challenge, which will award travel grants of up to $2,000 to up to five winning teams that enter their LODLAM project, toolset or prototype into the ring. These 5 teams will compete for a cash prize in an American Idol/Apprentice style pitch at the Summit to a panel of judges.


Maybe some participants at the THATCamp might have a super idea to pitch?  Details for the Summit should appear on the LODLAM website on September 1, including the application process.


Innovative Reading Interfaces

For pleasure, for self-improvement and for research we have used the book interface for a long, long, long time. How does a digital environment support alternative reading experiences? Electronic scholarly editions of literature, philosophy and theology have been frequently criticised for not moving beyond the model of the book. But what are the alternatives? Particularly in cases where a work has been revised and the authority of those revisions remains debatable, how can we better engage with complex conditions of textual and historical change?

I’d like to participate in a discussion that acknowledges the traditions and the efficient designs of books and considers the directions that interface design can go in order to move beyond the model of the book that electronic scholarly editions have replicated in the majority of cases. This probably goes for any type of e-book. How do we take advantage of the “e” and move beyond the “book”?

Digital ANZAC?

Thanks Sue Hutley for pointing out that the ANZAC centenary is fast approaching. Sue asks, “How can we make it easier for Australians (and everyone interested in the Anniversary 1914-1918) to connect with our collections of significance?”

That sounds like a session … or two! Digitisation projects at local, state, national and international levels will provide access to an enormous amount of material held in their archives. What sort of tools are necessary to support connection suitable for students, teachers, researchers, and the general public? What tools are already available? What should we think about building to prepare for the Centenary?

There could be scope for discussion and hacking with this subject. Session facilitators are welcome to volunteer before or on the day. Hackers might want to come prepared.

It’s worth noting here the recent Library Hack and Gov Hack competitions. The Edge provides the space and facilities for a day of collaborative hacking in the session streams. Contact us any way you like if you wish to coordinate.

Text Analysis, Anyone?

Thanks Jean McBain for pointing out an event of interest in Canberra (see below), containing themes that will be of interest at THATCamp Brisbane. In fact, Ian Wood was in attendance at last year’s THATCamp Canberra!

With online tools such as Voyant Tools and, perhaps, by drawing on a decent-sized corpus with tools from the Wragge Labs Emporium, there is an opportunity to do some small-crowd text analysis. With so much text online there’s plenty to choose from.

If you’re in Canberra, try and get to Ian’s presentation. If you’re not in Canberra, come to THATCamp Brisbane and you’ll find plenty of people to talk to about text analysis.

Analysing Historical Texts and Blogs using Meandre & SEASR
Where: Room 10, Digital Humanities Hub, Building 101, 9 Liversidge Street, ANU
When: 10-11am, Friday 10 August
SEASR is an Andrew W. Mellon Foundation funded toolkit designed to enable researchers to rapidly design, build and share software applications that support research and collaboration. Meandre is the execution framework and graphical programming environment behind SEASR. SEASR and Meandre currently integrate an impressive array of text analysis tools.
Ian Wood will give a brief overview of Meandre and SEASR’s capabilities and share some experiences using Meandre for analysing historical texts and blog data.
Ian Wood is a PhD student in Computer Science currently attempting to develop text analysis tools that mimic traditional questionnaire based methods of measuring psychology. He has a Masters in Computer Science, which explores the potential for semantically enabled science publishing, and has been working with Dr Carolyn Strange on a project involving the analysis of a collection of historical newspaper articles, letters and novels. He is fascinated by the potential for modern text analysis in combination with voluminous social media data to provide empirically grounded insights into sociological processes.

Suggestions for Session Themes Welcome

If you’re wondering what to propose, or what sort of sessions might emerge on the day at THATCamp Brisbane, have a look at the web-sites of previous THATCamps in Australia.



Or, you can look at what has happened overseas by visiting THATCamp headquarters. There are plenty of ideas for Session Genres here:

Of course, there’s no need to replicate what has happened elsewhere. It is important to address the needs and interests of our own participants and something could emerge from events and experiences in the weeks or days that precede THATCamp Brisbane. Once you’ve registered, feel free to post suggestions as a blog entry. Once we have a few, we’ll collect these on a Google doc for all to see and add comments.