
Workshop on field studies of data and software work in science

 

The Google Doc for our notes.

The short presentations and discussion sessions will focus on these questions (collaboratively developed via our mailing list):

  1. In what respects do data and software serve common functions in science?
  2. In what respects do data and software serve competing functions in science?
  3. In studying software practices, how should the role of data practices be considered?
  4. In studying data practices, how should the role of software practices be considered?
  5. What actions and practices on the part of data collectors, curators, and subsequent users lead to successful data sharing?
  6. What actions and practices on the part of software developers, repository owners, and subsequent users lead to successful software sharing?
  7. How can the usefulness of data be sustained over time? How can we anticipate its actual utility in order to determine how much to invest and what actions to take to support sustainability? (Same question for software.)
  8. To what extent can lessons from data sustainability and sharing inform practices surrounding software, and vice versa?
  9. Pushing our discussion closer to the sciences, objects of study and phenomena: data as about something, data as from somewhere, data as for something (think: sites of collection, instruments, and data as standing in for some or many things). Scientific software too is often about something, and it does something (think: modelling). In some cases, software is not intended to be about something (‘domain specific’) but to serve as a general analytic tool, which is interesting too.
  10. Pushing our discussion to consider the bigger picture by placing data and software in their institutional settings: the drive to archive, share and interoperate data, and the drive to systematize software, its production and accountability, are both increasingly policy-mandated. We should also consider the institutional actors, such as Sloan and NSF, who are shaping and funding a vision for data science, as well as industry's push and pull to reshape university curricula and educate a new workforce (something that universities are responding to with gusto, even if each does so differently).

  11. Lastly, thinking temporally: neither software nor data are new. They have long, heterogeneous lineages of production, care, sharing and use, and some of these lineages have lock-in or legacy effects; even new software and data are ‘born’ into these legacies. Thinking forward, rather than historically: many contemporary architectures will have legacy or lock-in effects in the future too, and some institutional actors are thinking specifically about that and enacting long-term visions for the future of data, software, etc.

Some answers or factors that may be considered in the above questions: