Sharing Best Data and Project Management Practices for Organising Data and Avoiding Digital Clutter

4 February 2021

This week we had a conversation to share best practices for organizing data, project management, and avoiding digital clutter. If you, like many of us, are struggling with these issues, this blog doesn’t promise to have all the answers, but it lists all of the advice, tips, and resources people offered, and maybe it will give you some ideas.

Before the session, an article was shared that people found very helpful for its useful, practical advice:

Wilson, G. et al. (2017) ‘Good enough practices in scientific computing’, PLOS Computational Biology, 13(6), p. e1005510. https://doi.org/10.1371/journal.pcbi.1005510

If you find that article helpful, then you may also want to look at this one by the same team:

Wilson G, Aruliah DA, Brown CT, Chue Hong NP, Davis M, Guy RT, et al. (2014) Best Practices for Scientific Computing. PLoS Biol 12(1): e1001745. https://doi.org/10.1371/journal.pbio.1001745

Here are the advice, tips, and resources from our conversation:

  • Naming files with a date in the format YYYY MM DD followed by a short description is useful, because it allows people to find the data by date or by description.
    • Keep the descriptions as short as possible
    • DO NOT use special characters (? ß \ ö ä ü)
    • DO NOT use spaces
      (Not all software supports file names that contain special characters or spaces.)
  • Store files in a single location. Avoid the temptation to have multiple copies for any reason.
    • One way of doing this is to adopt a Wiki style system of organization
    • Another, similar method is to convert your file system to a website-type structure and to make use of hypertext in documentation
    • An easy way to avoid multiple copies is to create a shortcut to the file instead of copying the file
    • To identify and clean up existing duplicates, Mac users might find Washing Machine useful; there are also tools for Windows machines, such as Beyond Compare, which offers a free trial
  • Keep a log of everything you do, changes you make, and analyses conducted
    • Manual documentation in a log book or similar document, even an Excel tab, is one option
      • Engineers are trained to keep a log of everything and maintain log books, so that if something happens in the future and they are inspected, they can say exactly what was done.
      • Wilson et al. (2017) highlight that it is time-consuming, but worth it in the end.
    • Git is highly recommended
  • Keep different versions of things organized
    • Use Archive folders to store older versions of things
    • SharePoint and OneDrive both have version control
    • Git is useful for this as well
  • When collaborating with others on an article or manuscript there are a few ways to keep track of versions
    • SharePoint and OneDrive are both useful for this
      • But if collaborating in real time using OneDrive, using it in a browser is recommended
    • A less elegant solution that people use is to email back and forth, adding initials to the file name to indicate changes.
      • For example, person AB might initially draft Manuscript.docx and email it to collaborators CD and EF. Collaborator CD makes changes, renames the document Manuscript_CD.docx, then emails it to AB and EF. EF makes changes, renames the document Manuscript_CD_EF.docx, and so on, so that by the end there might be a dozen sets of initials at the end of the file name, added as everyone takes turns editing and modifying the manuscript.
    • Google Docs is another solution for collaboration
  • The best way to avoid digital clutter is to minimize the creation of it in the first place.
    • Minimize it by having data management and organization plans from the start that avoid duplication of files.
      • If you are writing a grant application, there are people who can help you develop a robust data management plan. Ask for help!
    • When to delete the archive/old versions?
      • For small files and projects it is easiest to just keep everything forever
      • For large files and projects….???
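
The file-naming advice at the top of this list can be turned into a small helper. This is only a sketch, not something presented in the conversation: the function name make_filename, the ISO-style hyphens in the date, and the underscore separator are all assumptions chosen for illustration.

```python
import re
import unicodedata
from datetime import date


def make_filename(description: str, extension: str, day: date) -> str:
    """Build a name like 2021-02-04_site_survey.csv (illustrative convention)."""
    # Split accented letters into base letter + accent, then drop everything
    # that is not plain ASCII. Characters with no ASCII base (e.g. ß) are lost.
    ascii_text = (
        unicodedata.normalize("NFKD", description)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Replace each run of spaces or special characters with one underscore.
    slug = re.sub(r"[^A-Za-z0-9]+", "_", ascii_text).strip("_").lower()
    return f"{day.isoformat()}_{slug}.{extension.lstrip('.')}"


print(make_filename("Site survey: Köln?", "csv", date(2021, 2, 4)))
# prints 2021-02-04_site_survey_koln.csv
```

Putting the date first in year-month-day order means that sorting by name also sorts by date.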
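
For the tip on identifying existing duplicates, a script can do a rough first pass before reaching for a dedicated tool. The sketch below is not how Washing Machine or Beyond Compare work; it simply groups files whose contents hash to the same SHA-256 value. The function name find_duplicates is hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(root: str) -> list[list[Path]]:
    """Return groups of files under root whose contents are identical."""
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        # Skip folders, and skip symbolic links so that shortcuts to a file
        # are not reported as copies of it.
        if path.is_file() and not path.is_symlink():
            # Reads the whole file into memory: fine for small files,
            # but large files would be better hashed in chunks.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path)
    return [group for group in by_digest.values() if len(group) > 1]
```

Calling find_duplicates on a project folder returns one list per set of identical files. It only reports them, so deciding which copy to keep stays a manual step.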
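
The advice to keep a log of everything you do can be made low-effort with a few lines of code. This is a minimal sketch of a plain-text log, assuming a hypothetical file name project_log.txt and function name log_entry; it does not replace Git or a proper log book.

```python
from datetime import datetime
from pathlib import Path


def log_entry(message: str, log_path: str = "project_log.txt") -> str:
    """Append one timestamped line to a plain-text log and return that line."""
    # Timestamp in local time, to the second, then a tab, then the message.
    line = f"{datetime.now().isoformat(timespec='seconds')}\t{message}\n"
    # Mode "a" appends, so earlier entries are never overwritten.
    with Path(log_path).open("a", encoding="utf-8") as handle:
        handle.write(line)
    return line
```

For example, log_entry("Removed duplicate survey files") adds one dated line to the log each time it is called.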

 

Some final notes:

  • If you know of some useful, popular software not currently licensed by the university, it is possible to apply for the university/CIS to acquire it
  • Some training is available through Advanced Research Computing (ARC) and DCAD, but suggestions for additional courses are welcome!

 

------------------------------------

This blog expresses the author's views and interpretation of comments made during the conversation. Any errors or omissions are the author's.

Michelle de Gruchy is a postdoctoral research assistant on the Climate, Landscape, Settlement and Society (CLaSS) Project in the Archaeology Department.



Comments

  1. Caveat: The advice to 'Store files in a single location' is not advice against backing up data. Always back up your data!
