How to document a summer academy

From Hackers & Designers

Choosing the right technical workflow for your hybrid publication

by Vicky De Visser and James Bryan Graves


Content creation

Before planning on how to gather the documentation we had to ask ourselves: what is to be documented?

There are three different moments when content is created and has to be documented:

Before the summer academy

Although the academy had not started yet, a lot of content was already being generated by the organising staff. They did research, chose a subject, planned schedules and dates, got the word out through printed and online media, and processed the motivations of applicants. It is good to document this process as well. By documenting the organisational flow you create a base or structure for the upcoming summer school program, and it is useful for sharing information about the future academy with applicants.

During the summer academy

During the summer academy, different formats of content were provided to the students. Analyse and estimate this before your academy starts; this way you can start looking for a suitable platform where everything can be documented. It will save you a lot of time after the summer school is finished. This was the first time we organised a summer academy and we had to learn the hard way: it took us quite some time to set up a good platform after the summer academy was over.

For collaborative note taking we experimented with PiratePad and Etherpad. Both are online text editors in which different users can take and edit notes in the same document. We really liked these editors.

Pictures were taken both by the organisation and by participants. After the academy we had to email every participant several times to ask them to send their pictures. Using a platform that can be used during the academy makes it easier to upload the photos or other content at once.

After the summer academy

After the academy is finished, you can ask participants and teachers to write a review. This can be helpful feedback if you want to organise another edition the year after.

Choosing the right back-end to collect the documentation

To compile all the content we wanted to bundle co-writing, co-editing and content management on the same platform. We looked for a back-end we could document on in real time, that was easily accessible, had a good way of ordering information, had a short learning curve for participants and supported several kinds of media files. The lectures, workshops, screenings, excursions, party and exhibition provided a big variety of media such as videos, sound files, notes, images and code, as well as objects and long texts that would not be suitable for an online publication alone.

We chose to use MediaWiki, which supports the various file formats and is easily editable by different users. Its easily understandable structure and user guide are beneficial for changing collaborations and teams. During this academy we did not yet have the opportunity to test MediaWiki as a back-end and content collector where students could upload their content as wiki pages, because we still had to build it. We used it after the academy to gather the information collected by writing emails and uploading code and pictures to a Dropbox repository. While filling our MediaWiki back-end we encountered several hierarchy and workflow problems.

Categorisation of the content

We wanted to sort the documentation in order to publish a well-structured and logical publication. We sorted the content by categorising the wiki pages. Andre Castro recommended that we use a categorisation system sorted by state, media and topic.

Because some media, such as videos, sound files and links, are not suitable for a print publication, we added a media category so we could divide articles into a print-suitable version and a web-only version.

It is important to consider that imposing a structure carries the danger of guiding the contributed content too much. Leave space for adding things that were not thought of before.

H&D Wiki category structure:

State

  • EditMe
  • WriteMe
  • Published
  • Ready to be published

Media

  • Print
  • Web

Topic / Tags

  • e.g. Arduino
  • e.g. HTML 2 Print
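
The categories also make the content queryable. As a minimal sketch in the same Python 2 / urllib2 style as the script at the end of this article, the standard MediaWiki API can list every page tagged with, for example, the Print category:

    import json
    import urllib2
    
    wikiUrl = 'http://wiki.hackersanddesigners.nl/mediawiki/'
    
    # Ask the MediaWiki API for every page in Category:Print
    url = wikiUrl + 'api.php?action=query&list=categorymembers&cmtitle=Category:Print&cmlimit=500&format=json'
    result = json.load(urllib2.urlopen(url))
    
    for member in result['query']['categorymembers']:
      print member['title']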

Co-editing

After receiving the contributions, the organisation and participants must select the most viable and implementable ones. The challenge of the selection process is that many submissions are not directly useful, follow a different hierarchy or are difficult to implement in the publication. Organisers have to deal with the submitted ideas in a very subtle way: throughout the process they do not want to reject submissions and risk alienating participants, which may eventually lead to disengagement. It is advisable to give the participants a short introduction to how to collect and document the content. This way the organisers can prevent an extensive editing process afterwards.

MediaWiki lets users adjust articles and write comments on the changes that were made. Participants can go back in time and compare older versions of the same article. This way content does not get lost when an article is over-edited.
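
That version history is also exposed through the MediaWiki API. As a minimal sketch in the same Python 2 / urllib2 style as the script at the end of this article (the page title is only an example), the following prints the most recent revisions of a page with author, date and edit comment:

    import json
    import urllib2
    
    wikiUrl = 'http://wiki.hackersanddesigners.nl/mediawiki/'
    page = 'How_to_document_a_summer_academy'
    
    # Ask for the five most recent revisions of the page
    url = wikiUrl + 'api.php?action=query&prop=revisions&titles=' + page + '&rvprop=ids|timestamp|user|comment&rvlimit=5&format=json'
    result = json.load(urllib2.urlopen(url))
    
    for pageInfo in result['query']['pages'].values():
      for rev in pageInfo['revisions']:
        print rev['revid'], rev['timestamp'], rev['user'], rev['comment']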

Publishing

After building and filling our back-end documentation platform we had to translate the content into a website and a print publication. For print we looked into two options: creating a publication from the website, and creating a publication directly from the wiki pages. Both processes are discussed here.

Currently used set-up of the Hackers & Designers documentation platform:

  • Back-end: MediaWiki
  • Front-end: website of Hackers & Designers
  • Print publication: generated from the wiki
  • Workflow: organisation > preparation > communication > collaborative note taking and input assembly > editing > publication > printing > distribution

Translation Wiki to website

  • Pandoc: translates wiki markup to the HTML markup language
  • Python scripts: initiate several actions
  • Website layout: CSS

Translation Wiki to print

  • Pandoc: translates wiki markup to LaTeX markup
  • Python scripts: initiate several actions
  • LaTeX: lays out the content from the wiki pages and generates the PDF
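
The "several actions" initiated by the Python scripts come down to fetching pages from the wiki (the full script is shown at the end of this article) and handing the result to Pandoc. As a minimal, hypothetical sketch of that second step, assuming pandoc is installed and a handd-book.wiki file has already been assembled:

    import subprocess
    
    # Hand the assembled wiki file to Pandoc for conversion to LaTeX;
    # the file names are illustrative
    subprocess.call(['pandoc', '-f', 'mediawiki', '-t', 'latex',
                     'handd-book.wiki', '-o', 'handd-book.tex'])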

Publishing online

The Hackers & Designers website is always published in real time: every time the website is loaded or pages are requested, the web browser queries the MediaWiki back-end through the API provided by MediaWiki, along with installed semantic wiki extensions and their APIs, via a client-side JavaScript web application (AngularJS).


Example API Requests

Request a page:

   curl 'http://wiki.hackersanddesigners.nl/mediawiki/api.php?action=parse&page=How_to_document_a_summer_academy&format=json&disableeditsection=true&prop=wikitext%7Cimages%7Clinks'

Request an image:

   curl 'http://wiki.hackersanddesigners.nl/mediawiki/api.php?action=query&titles=File:Chicken-and-Potato-Soup.png&prop=imageinfo&&iiprop=url&format=json'

Code is available on GitHub.

Publishing print

1. Web to print: HTML to print with CSS and print preview

One way to publish a print publication from an online environment is to print the webpage using CSS. The user simply clicks 'Print Page', a print-preview pop-up appears, and the custom CSS translates the webpage and its media into a printable publication divided into pages. The design is also made with code/text, which means you can use text editors such as Etherpad to collaborate on content with several people at the same time. The downside of this set-up is that if you decide not to use a single-page website, you will have to print each article separately and later bind them into a book. This way no table of contents is created, and there is no option to apply a different layout to indexes, special chapters or glossaries.

Differences Between CSS For The Web And CSS For Print


You also have to consider the differences between a webpage and a printed publication. What will happen with image galleries, sliders, menus and the like? If your webpage has a slideshow with ten images at the top, that is not going to translate well to paper. The most basic level of interaction on the web is a link, and this too becomes problematic. On your computer you can simply click a link to see where it goes; on paper this functionality is lost, so you need a good way to take all those inline links and show the reader where they lead.

The biggest difference, and conceptual shift, is that printed documents refer to a page model that is of a fixed size. Whereas on the web we are constantly reminded that we have no idea of the size of the viewport, in print the fixed size of each page has a bearing on everything that we do. Due to this fixed page size, we have to consider our document as a collection of pages, paged media, rather than the continuous media that is a web page. Paged media introduces concepts that make no sense on the web. For example, you need to be able to generate page numbers, put chapter titles in margins, break content appropriately in order that figures don’t become disassociated from their captions. You might need to create cross-references and footnotes, indexes and tables of content from your document. You could import the document into a desktop publishing package and create all of this by hand, however, the work would then need redoing the next time you update the copy. This is where CSS comes in, whose specifications are designed for use in creating paged media.

Print Preview

When you want to print your website, a print-preview window appears. The back-end code in the browser needs to do heavy lifting to support all the requests coming from the front-end.

The majority of the work in the back-end involves restructuring the HTML to support the preview. Once all the platforms are brought in line, the printing pipeline is broken up into two pieces: PDF generation and printing. The user has to select a printer before a print render is generated for the preview. When printing, the renderer generates printing metadata one page at a time, so it can generate pages while the PrintJobWorker in the browser process sends them to the printer at the same time.


From CSS and HTML to print with Prince

After you have prepared the HTML and CSS for the website, and the layout for the book is set with CSS, Prince or a similar tool can be used to translate the HTML into a PDF.

Prince

You can download a free trial of the Prince application and install it from the Terminal. The Terminal runs Prince and uses it to translate the website into a PDF. Prince is not an open-source program but can be used as a trial, which means the front page of the PDF includes a logo of the Prince software.
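
For reference, a typical invocation from the Terminal looks roughly like this (the file names are only illustrative); Prince reads the HTML and its linked print CSS and writes a paginated PDF:

   prince book.html -o book.pdf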


2. Wiki to print

Collection and the Offline Content Generator

After doing some research we found out that an open-source MediaWiki extension called Collection had already been developed by the Wiki Publisher project. The extension makes it possible to select pages and compile them into books or PDFs. The user can create different chapters and arrange articles. The compiled pages are then sent to the OCG (Offline Content Generator). The OCG invokes a bundler to spider the articles and fetch all images, stylesheets, etc. required to render them. One of several back-ends is then invoked to convert the bundle into a PDF, ZIM file or other appropriate output format. This conversion to a PDF happens on the wiki servers. We wanted to use this open-source extension and adjust the code to fit our needs (e.g. layout, page size, table of contents, ...).

For rendering on your own server (not using the wiki's content generator) we installed the Offline Content Generator ourselves. This way we were able to change the layout of the output files (PDFs).

The Offline Content Generator package consists of:

  • Service
  • Bundler: This tool grabs all the dependencies for a given set of articles and creates a directory or zip file.
  • Latexer

LaTeX

LaTeX is a high-quality typesetting system; it includes features designed for the production of technical and scientific documentation. LaTeX is available as free software.

  • Several extensions, called 'packages' or 'styles', are available for LaTeX (better image placement, implementation of mathematical formulas, ...)
  • You create a text file in LaTeX markup, which LaTeX reads to produce the final document.
  • The user needs to know the commands of the LaTeX markup.
  • LaTeX automatically adjusts fonts, text size, line heights and text flow.
  • LaTeX is not very flexible for layout, but you can write your own macros or download those of others from CTAN.
  • pdfTeX: an engine or PDF compiler that converts the LaTeX markup into a PDF.

To be able to use LaTeX markup, you have to install a LaTeX distribution and an editor. We used the Texmaker editor for Mac because it enables quick previews of the PDF file.

Bookshelf

This MediaWiki extension allows previously compiled books to be added to a bookshelf. The bookshelves can be used by other users to print previously published editions of a publication. This way we can compile one 'official' documentation book and let other people print it.

Pandoc

This program converts files from one markup language to another. In this case we used it to convert the MediaWiki markup to HTML for our website, and to LaTeX markup for the print publication.
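
As an illustration of those two conversions, assuming a single page has been saved locally as page.wiki (the file names are only examples), the corresponding one-off Pandoc commands would be:

   pandoc -f mediawiki -t html page.wiki -o page.html
   pandoc -f mediawiki -t latex page.wiki -o page.tex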

Mark-up

A markup language is a notation used to annotate a document's content, giving information about the structure of the text or instructions for how it is to be displayed. For example, two single quotes around a word (''word'') mark italics in wiki markup, while <em>word</em> does the same in HTML.

Hackers & Designers Approach

After evaluating all the available options, Hackers & Designers opted to use a Python script to collect the MediaWiki content and then use Pandoc, in combination with a LaTeX template, to generate the final PDF artifact.

Python script:

   # -*- coding: utf-8 -*-
   import json
   import urllib2
   import os
   import re
   
   wikiUrl = 'http://wiki.hackersanddesigners.nl/mediawiki/'
   
   f = open('handd-book.wiki', 'w')  
   pages = [
     'How_to_organize_a_summer_academy',
     ...
     'Credits'
   ]
   
   def get_image(filename):
     if os.path.exists(filename):
       return
     wikiJson = json.load(urllib2.urlopen(wikiUrl + 'api.php?action=query&titles=File:' + filename + '&prop=imageinfo&&iiprop=url&format=json'))
     print wikiJson
     try:
       pages = wikiJson['query']['pages']
       print pages
       for key, value in pages.iteritems():
         url = value['imageinfo'][0]['url']
         print url
         img_res = urllib2.urlopen(url)
         img_file = open(filename, 'wb')
         img_file.write(img_res.read())
         img_file.close()
     except Exception, e:
       print e
   
   for page in pages:
     pageUrl = wikiUrl + 'api.php?action=parse&page=' + page + '&format=json&disableeditsection=true&prop=wikitext|images|links' 
     print pageUrl 
     wikiJson = json.load(urllib2.urlopen(pageUrl))
     wikistr = ''
     try:
       title = wikiJson['parse']['title'].encode('utf-8').strip()
       print title
       wikistr += '\n\n=' + title + '=\n\n'
     except Exception, e:
       print e
   
     try:
       # Get images - JBG
       imgs = wikiJson['parse']['images']
       for img in imgs:
         img = img.encode('utf-8').strip()
         print ' - ' + img
         get_image(img)
     except Exception, e:
       print e
   
     try:
       wikistr += wikiJson['parse']['wikitext']['*'].encode('utf-8').strip()
        wikistr = re.sub(r'\|\d*(x\d*)?px', '', wikistr) # Remove px info from images - JBG
        wikistr = re.sub(r'{{[A-Za-z0-9#:|/.?= \n&\-\\\”\{\}]*}}', '', wikistr) # Remove youtube links - JBG
   
       # Replace internal wiki links with external links for footnotes - JBG
       for link in wikiJson['parse']['links']:
         link_str = link['*'].encode('utf-8').strip()
         prep_str = link_str.replace(' ', '_')
         wikistr = re.sub(r'\[\[' + link_str + '[A-Za-z0-9\(\)| ]*\]\]', '[' + wikiUrl + 'index.php/' + prep_str + ' ' + link_str + ']', wikistr) 
   
       f.write(wikistr)
      except Exception, e:
        print e
    
    # Close the output file so the assembled wiki source is flushed to disk
    f.close()

Pandoc command:

   pandoc --template handd-book.template handd-book.wiki -o handd-book.pdf --toc -M fontsize=12pt -M author='Hackers \& Designers' -M title='About Bugs, Bots and Bytes' --latex-engine xelatex -M mainfont='Helvetica' -M papersize='a5paper' -V links-as-notes

Code is available on GitHub.