Sunday 24 May 2015

E-Resume Formats Analysis

h-Resume and JSON Resume

I have been pondering CV formats of late, not because I am looking for another job, but because someone I know is, and they are getting increasingly annoyed with replicating the information they have onto application forms.

After looking around there seem to be two major contenders for a preferred format, h-resume and JSON Resume. I am going to take some time analysing these two formats and then compare and contrast them before making a decision as to what I will suggest my friend uses as a base format to store data for filling in application forms.

h-Resume

This microformat is a way of using standard HTML markup to allow the parsing of resume data. It uses a number of specific attributes in the markup to indicate the meaning of the content of the markup. It builds upon other microformats such as h-card and h-event and as such has an august ancestry.

I have used h-card extensively and implement it in all of the projects I can, but I think that I am in a minority. Having the same data parsable by both humans and machines has been an ideal for many years and is one of the strands of the Semantic Web, and microformats could play a part in this ideal. As an idea the Semantic Web holds a great deal of promise, but it has many detractors and to an extent has been superseded by other approaches. An analogy would be to think of content on the Semantic Web being presented in two formats: one which is understandable by humans, and a shadow language which indicates the meaning of the human-readable language to machines. The way that microformats manage this is best illustrated with an example:

This is one representation of me:

Dominic Myers
Senior Developer
Arcus Global Limited
dominic@arcusglobal.com
Future Business Centre,
King's Hedges Road
Cambridge,
Cambridgeshire,
CB4 2HY
+44(0)1223 911 841

If I were to put this information on a webpage I would use this markup:

<p>Dominic Myers</p>
<p>Senior Developer</p>
<p>Arcus Global Limited</p>
<p><a href="mailto:dominic@arcusglobal.com">dominic@arcusglobal.com</a></p>
<p>Future Business Centre,</p> 
<p>King's Hedges Road</p>
<p>Cambridge,</p>
<p>Cambridgeshire,</p>
<p>CB4 2HY</p>
<p>+44(0)1223 911 841</p>

This markup would tell the browser to display each of the lines in its own paragraph and the lines would be rendered one after the other on the page. If I were to use a microformat such as h-card (as I did using an online hCard Creator) I would get this markup:

<div id="h-card-Dominic-Richard-Myers" class="h-card">
    <a class="u-url p-name" href="http://dominicmyers.uk">
        <span class="p-given-name">Dominic</span>
        <span class="p-family-name">Myers</span>
    </a>
    <div class="p-org">Arcus Global Limited</div>
    <a class="u-email" href="mailto:dominic@arcusglobal.com">dominic@arcusglobal.com</a>
    <div class="p-adr">
        <div class="p-street-address">Future Business Centre, King's Hedges Road</div>
        <span class="p-locality">Cambridge</span>, 
        <span class="p-region">Cambridgeshire</span>, 
        <span class="p-postal-code">CB4 2HY</span>
        <span class="p-country-name">United Kingdom</span>
    </div>
    <div class="p-tel">+44(0)1223 911 841</div>
    <div>
        <span class="p-role">Senior Developer</span>
    </div>
</div>

In terms of the display of the data in the browser they are nearly identical (and could be made identical with some extra effort), and in terms of the informational content for the human reader they are identical. The addition of those extra class attributes, though, means that suitable parsers are able to make some sense of the data: they would be able to offer to call me on the telephone or address a letter or parcel, and might even show the position of my workplace on a map. The possibilities are almost endless. This is a powerful concept and similar to one ideal behind the Semantic Web: that of intelligent software agents being able to search the internet for themselves and find data of relevance to their master.
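To make the idea of a "suitable parser" a little more concrete, here is a minimal sketch in JavaScript. It is emphatically not a real microformats parser: the function name extractProperties is my own invention, it only copes with elements whose content is plain text, and it reads the text of u- properties rather than their href, which a proper parser would not do.

```javascript
// Hypothetical helper: a deliberately naive, regex-based property extractor.
// A real microformats2 parser walks the DOM; this only illustrates the idea.
function extractProperties(html) {
    var properties = {};
    // Match an opening tag with a class attribute followed directly by text.
    var pattern = /<\w+[^>]*\bclass="([^"]*)"[^>]*>([^<]*)/g;
    var match;
    while ((match = pattern.exec(html)) !== null) {
        var text = match[2].trim();
        if (!text) {
            continue; // The element only wraps other elements.
        }
        match[1].split(/\s+/).forEach(function (className) {
            // Property classes are prefixed p-, u-, dt- or e-.
            if (/^(p|u|dt|e)-/.test(className)) {
                properties[className] = text;
            }
        });
    }
    return properties;
}

// A cut-down version of the h-card shown above.
var card = '<div class="h-card">' +
    '<span class="p-given-name">Dominic</span> ' +
    '<span class="p-family-name">Myers</span>' +
    '<div class="p-org">Arcus Global Limited</div>' +
    '<div class="p-tel">+44(0)1223 911 841</div>' +
    '</div>';

console.log(extractProperties(card));
```

The point is only that the class names give software something to hold on to; anything serious should use an established parser rather than a regular expression.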

h-resume uses h-card as well as h-event to display richer data about a person looking for employment. Its main properties are:

  • p-name - brief name of the resume
  • p-summary - overview of qualifications and objectives
  • p-contact - current contact info in an h-card
  • p-education - an education h-event event, years, embedded h-card of the school, location
  • p-experience - a job or other professional experience h-event event, years, embedded h-card of the organization, location, job-title
  • p-skill - a skill or ability, optionally including level and/or duration of experience
  • p-affiliation - an affiliation with an h-card organization

As such we can see that we mark up a standard resume with machine-parseable attributes according to 3 specific schemas. It must also be noted that all properties are optional. h-resume, h-card and h-event are all parts of the microformats2 specification, and h-resume in particular is a draft specification. Each is an update to an earlier microformat: h-resume is an update to hResume, h-card is an update to hCard and h-event is an update to hCalendar.

It has the advantage of not making any demands upon the reader: the display of data can remain the same, while extra attributes within the markup tags allow for parsing by machines. Because every property is optional, the structure of a resume can be altered, for instance by interspersing education and employment datums, and one would still imagine that parsers would be able to understand and place those datums into suitable arrays. To an extent this might be valuable for those who continue to pursue educational attainment whilst at work, as it would provide a more narrative resume.

JSON Resume

Whilst h-resume has its feet firmly planted within the sphere of XML, making use of markup attributes as it does, JSON Resume has a much more modern feel as it proudly uses JavaScript Object Notation (JSON) to store relevant data. It also seems to be aimed much more at developers, whereas h-resume seems to be aimed at Computer Scientists.

It has 10 properties compared to the 7 of h-resume:

  • basics - object
  • work - array of objects
  • volunteer - array of objects
  • education - array of objects
  • awards - array of objects
  • publications - array of objects
  • skills - array of objects
  • languages - array of objects
  • interests - array of objects
  • references - array of objects

JSON Resume is a new format and is on Version 0.0.0, with the schema being hosted on GitHub; as such people are free to adapt it to their requirements. Because it uses JSON we should perhaps discuss the data format more. JSON is a data-interchange format which is easily read by both humans and machines and is familiar to developers. It is made up of key/value pairs, with the value possibly being an object, an array or a data primitive. In terms of its formatted appearance it is similar to XML, being a nested representation of a collection of datums. In order to demonstrate this we could use the example I used above for my h-card:

{
    "full name": {
        "given name": "Dominic",
        "family name": "Myers"
    },
    "email": "dominic@arcusglobal.com",
    "organisation": {
        "name": "Arcus Global Limited",
        "address": {
            "street address": "Future Business Centre, King's Hedges Road",
            "locality": "Cambridge",
            "region": "Cambridgeshire",
            "postal code": "CB4 2HY",
            "country": "United Kingdom"
        }
    },
    "telephone": "+44(0)1223 911 841",
    "role": "Senior Developer"
}

It must be noted that this is not a representation of JSON Resume but merely an example of JSON. The exact same data is stored in the above JSON as was stored in the h-card, and in a similar way in terms of the structure (please note the indentation, as it indicates a datum's position within a hierarchy of information in both formats). There is a distinct difference in terms of the legibility and the number of characters used (802 for the XML and 516 for the JSON), though, with the JSON being significantly more descriptive and less opaque.
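Because JSON is native to JavaScript, getting values back out of it takes very little work. The following is a small sketch using a trimmed copy of the example above; the variable names are mine and nothing here is specific to JSON Resume.

```javascript
// A trimmed copy of the example above, held as a string as it would be on disk.
var text = '{' +
    '"full name": {"given name": "Dominic", "family name": "Myers"},' +
    '"organisation": {"name": "Arcus Global Limited",' +
    '"address": {"locality": "Cambridge", "postal code": "CB4 2HY"}},' +
    '"role": "Senior Developer"' +
    '}';

var person = JSON.parse(text);

// Keys containing spaces need bracket notation rather than dot notation.
var fullName = person["full name"]["given name"] + " " +
    person["full name"]["family name"];
var locality = person.organisation.address.locality;

console.log(fullName + ", " + person.role + ", " + locality);
// prints "Dominic Myers, Senior Developer, Cambridge"
```

The nesting in the data maps directly onto the chain of property lookups, which is a large part of why developers find the format so comfortable.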

JSON Resume is not a display format though. While it is certainly readable by both humans and machines it is not designed for display in the same way that h-resume is. It is reliant upon a developer putting it through a process in order for a desirable display to be produced. This is both a strength and weakness as it allows for greater control of the display of the information but is reliant upon a certain level of technical knowledge.

Another strength is its use of arrays to hold collections of similar datums. This is perhaps a recognition of and a nod to the people whom JSON Resume is designed for as such a structure is familiar to developers and seems elegant.

Discussion

It is perhaps unfair to characterise h-resume as an XML format, but it is understandable in that it uses HTML, which is a close relative of XML rather than a strict subset of it. It is possible for a well-formed HTML document to be parsed as valid XML… I think we can be justified in calling it XML-like at the very least.

While JSON Resume needs to be formatted in order to produce a meaningful display, h-resume is already in a form designed for display. In their raw forms the reverse is true, with JSON Resume being much more readable. To some extent comparing the two is the equivalent of comparing a chicken with an egg. One has the ability to be the other but the reverse is less true.

Properties

h-resume has 7 main properties whereas JSON Resume has 10:

Format:      h-resume         JSON Resume
Properties:  p-name           basics
             p-summary        work
             p-contact        volunteer
             p-education      education
             p-experience     awards
             p-skill          publications
             p-affiliation    skills
                              languages
                              interests
                              references

There are a number of similar properties and it could be argued that both formats allow for the display of exactly the same information. Directly analogous properties are shown in this illustration:

Further, it is possible to tentatively link properties from one to the other:

This is by no means an exhaustive merging of the fields from the two formats as I have simply tried to make the two merge by keeping my own Curriculum Vitae in mind. However p-name from h-resume and references from JSON Resume are left as orphan properties without some near exact equivalent - though, as noted above, it is possible for a person to massage relevant data into both formats without much difficulty.

Conclusion

As formats go I would argue that JSON Resume is perhaps the more flexible. Despite it not having a p-name property - a property which is merely designed to provide a unique name for the resume - it offers far greater granularity in terms of the properties h-resume classifies as p-experience. Rather than lumping paid and unpaid work, awards and publications together, JSON Resume allows these different areas to be separated. It might be argued that the same could be said for h-resume and that each area could be separated visually but marked up in the same way.

Both h-resume and JSON Resume make a point of including skills, but whereas JSON Resume has a suggested format for these characteristics, h-resume proposes a format to include a vocabulary designed to describe a skill. I am not sure how to measure a skill and what metrics to use, so the use of such a vocabulary might be worthwhile as it is proposed to include a name, rating and - most importantly - a duration. Someone who has practised a skill for a greater length of time is likely to be more adept at that skill - just so long as they do not become stuck in their ways. A rating is likely to reference some form of professional qualification and this area is contentious, as there are any number of bodies willing to provide certification depending upon the profession involved. JSON Resume counters this problem by having a distinct area for awards so the audience can make their own decision about whether a certification is from a valid professional body.

I have touched upon the fact that JSON Resume is designed to be a data interchange format rather than a display format and this means that to some extent the comparison is unfair. It is feasible for a developer to create a rendering theme which could turn JSON Resume into a perfectly valid h-resume document. This then means I would suggest that storing resume data in JSON Resume format offers the best approach, the format is flexible and parsers are designed to look for particular key/value pairs when displaying data - allowing for the possibility to extend those parsers to include other data which might be of relevance.
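As a rough sketch of what such a rendering theme might look like, the function below turns a JSON Resume style object into h-resume flavoured markup. Please treat it as an illustration under assumptions: renderHResume and escapeHtml are names I have made up, the field names inside each object (basics.name, basics.email, basics.summary, work[].company, work[].position and skills[].name) are my assumptions about the JSON Resume schema rather than anything listed above, and I have not checked the output against a microformats parser.

```javascript
// Escape text so that it is safe to place inside HTML.
function escapeHtml(value) {
    return String(value)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;");
}

// Hypothetical renderer: the field names used here are assumptions
// about the JSON Resume schema, not a definitive mapping.
function renderHResume(resume) {
    var basics = resume.basics || {};
    var lines = ['<article class="h-resume">'];

    lines.push('    <div class="p-contact h-card">');
    lines.push('        <span class="p-name">' + escapeHtml(basics.name || "") + '</span>');
    if (basics.email) {
        var email = escapeHtml(basics.email);
        lines.push('        <a class="u-email" href="mailto:' + email + '">' + email + '</a>');
    }
    lines.push('    </div>');

    if (basics.summary) {
        lines.push('    <p class="p-summary">' + escapeHtml(basics.summary) + '</p>');
    }

    (resume.work || []).forEach(function (job) {
        lines.push('    <div class="p-experience h-event">');
        lines.push('        <span class="p-name">' + escapeHtml(job.position || "") + '</span>');
        lines.push('        <span class="p-location h-card">' + escapeHtml(job.company || "") + '</span>');
        lines.push('    </div>');
    });

    (resume.skills || []).forEach(function (skill) {
        lines.push('    <span class="p-skill">' + escapeHtml(skill.name || "") + '</span>');
    });

    lines.push('</article>');
    return lines.join("\n");
}

var sample = {
    basics: { name: "Dominic Myers", email: "dominic@arcusglobal.com" },
    work: [{ company: "Arcus Global Limited", position: "Senior Developer" }],
    skills: [{ name: "JavaScript" }]
};

console.log(renderHResume(sample));
```

Going in this direction is straightforward because the JSON is the richer source; recovering the JSON from arbitrary markup would be the harder job.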

Addendum

There are many other resume formats available with differing levels of support, so this is by no means an exhaustive analysis. Please do read about the other formats.

Friday 22 May 2015

FileSystem How-To

Cross posted from our work blog.

An example of the images we were trying to save to the device... ohh dear!

Introduction

We're presently developing a mobile application using Cordova and the whole process has been brilliant! Most recently we've been using the FileSystem of the mobile device we're using (more often than not a virtual Android device using Genymotion) so we've had to get our heads around the File plugin.

My team is comprised of fullstack (with a leaning towards the LAMP and MEAN stacks) developers and front-end developers so this is why we've had some problems getting our heads around this exciting world of the FileSystem. We're used to interacting with servers and putting stuff up there and not having to think about storing things on a user's machine... apart from session cookies (and even then we'll use the language's built in abilities more often than not). To an extent it almost felt a little dirty to get into the business of storing files on a machine.

In order to get started we looked at the plugin documentation and, when that started to get a little dry, we looked at HTML5 Rocks, and that was much more exciting! It must be noted though that the FileSystem API is only available on Chrome at present, but that's cool as we primarily develop in Chrome (I often spend more time coding in the console rather than my IDE).

While it was exciting we came across any number of head-scratching moments so I've decided to document them here, I hope it helps!

Getting access

It's all well and good to be able to play with the FileSystem but how do you get started? The first thing to do is to ask whether or not you can play!

Because we're developing on a virtual device and the Chrome browser (thankfully it's the main browser I use for development as - as I noted above - it's the only major browser to support the FileSystem API) we need to ask permission nicely. This did cause us some problems to start with, but the thing to remember is that Chrome doesn't support the proper requestFileSystem function but has its own prefixed version, webkitRequestFileSystem. It's not a major issue but it's still a concern. Another thing to bear in mind is that your Cordova app should always include the cordova.js library (it won't be there unless the app is built, but if it's the last external JavaScript file you call then it's not an issue in development). Because the cordova.js file adds a cordova object to the window object, we can test for that to tell whether or not we need to do the substitution for Chrome. This is in the main file of our project and it works a treat:

var requestedBytes = 1024*1024*10; // 10MB
if(!window.cordova){
    window.requestFileSystem = window.requestFileSystem || window.webkitRequestFileSystem;
    navigator.webkitPersistentStorage.requestQuota (
        requestedBytes,
        function(grantedBytes) {
            window.requestFileSystem(
                PERSISTENT,
                requestedBytes,
                function(fs){
                    window.gFileSystem = fs;
                    fs.root.getDirectory(
                        "base",
                        {
                            "create": true
                        },
                        function(dir){
                         console.log("dir created", dir);
                        },
                        function(err){
                         console.error(err);
                        }
                    );
                },
                function(err) {
                    console.error(err);
                }
            );
        },
        function(err) {
            console.error(err);
        }
    );
}else{
    document.addEventListener(
        "deviceready",
        function() {
            window.requestFileSystem(
                PERSISTENT,
                requestedBytes,
                function(fs) {
                    window.gFileSystem = fs;
                    fs.root.getDirectory(
                        "base",
                        {
                            "create": true
                        },
                        function(dir){
                         // alert("dir created");
                        },
                        function(err){
                         alert(JSON.stringify(err));
                        }
                    );
                },
                function(err){
                    // alert() only shows its first argument, so build a single string.
                    alert("Error: " + JSON.stringify(err));
                }
            );
        },
        false
    );
}

This is somewhat lengthy but I like the indentation as it gives me a nice indication of what's happening and where. In the function which uses the Cordova plugin we're using alerts whereas we use console.logging on Chrome. In order for the alerts to make sense I'm converting the err objects into strings. You can do this yourself now to test by simply entering this in the console (just so long as you're reading this in Chrome):

window.requestFileSystem = window.requestFileSystem || window.webkitRequestFileSystem;
navigator.webkitPersistentStorage.requestQuota (
    1024*1024*10,
    function(grantedBytes) {
        window.requestFileSystem(
            PERSISTENT,
            1024*1024*10,
            function(fs){
                window.gFileSystem = fs;
                fs.root.getDirectory(
                    "base",
                    {
                        "create": true
                    },
                    function(dir){
                     console.log("dir created", dir);
                    },
                    function(err){
                     console.error(err);
                    }
                );
            },
            function(err) {
                console.error(err);
            }
        );
    },
    function(err) {
        console.error(err);
    }
);

Hopefully you'll see an alert asking for permission to use Local data storage and, once you've granted access, the console should print out these 2 lines:

undefined
dir created

The undefined means that the function doesn't return anything and it isn't anything to worry about, but the dir created means we've created a base directory. How cool!
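If the depth of nesting bothers you, one option is to wrap the callback-based calls in Promises. This is only a sketch of that idea and not what we actually shipped: requestFileSystemAsync, getDirectoryAsync and openBaseDirectory are names I've made up, and the request function is passed in as a parameter so that the same code could be handed either the Chrome or the Cordova version.

```javascript
// Hypothetical helpers that wrap the callback-based calls in Promises.
// `requestFn` is whichever requestFileSystem function the environment provides.
function requestFileSystemAsync(requestFn, type, bytes) {
    return new Promise(function (resolve, reject) {
        requestFn(type, bytes, resolve, reject);
    });
}

// Get (or create) a directory inside `parent`.
function getDirectoryAsync(parent, name) {
    return new Promise(function (resolve, reject) {
        parent.getDirectory(name, { create: true }, resolve, reject);
    });
}

// Request the file system and then make sure the "base" directory exists.
function openBaseDirectory(requestFn, type, bytes) {
    return requestFileSystemAsync(requestFn, type, bytes).then(function (fs) {
        return getDirectoryAsync(fs.root, "base");
    });
}

// In Chrome, once quota has been granted, this might be called as:
// openBaseDirectory(
//     window.webkitRequestFileSystem.bind(window),
//     window.PERSISTENT,
//     1024 * 1024 * 10
// ).then(function (dir) {
//     console.log("dir created", dir);
// }).catch(function (err) {
//     console.error(err);
// });
```

A single catch at the end then replaces the error callback at every level, at the cost of knowing a little less about which step failed.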

In order to better see the effects of what you've done download the HTML5 FileSystem Explorer Extended Chrome (FileSystem Explorer) plugin. Once you've installed it you'll be able to see an empty base directory in your permanent storage. If you have a way of serving files (and who doesn't?) you could do worse than looking at the HTML5 Rocks page I linked to above and running the sample code. It's really rather cool!

Working with Directories

I like to keep things somewhat organised, I guess I'm something of a neat freak, but that's okay really as it means I don't often lose things. As such I'd like to keep the files I want to work with organised. We also want our application to be enclosed in one folder so that if/when we make another app we can distinguish between document hierarchies. So we'll create a base folder as soon as we get told we can play with the FileSystem. Within that folder we can create other folders. This way we can create a whole taxonomy of our application's concerns and we'll be able to CRUD things later, confident that we know where things should go.

This process is somewhat philosophical though so I'd suggest giving it some thought. Coming from a LAMP background means that I'm all in favour of normalisation and believe in repeating myself as little as possible, being thrust into the world of MEAN means that I'm coming to embrace the "messy" approach of replicating data as and when needed as well as expanding database rows as I need to. This does require something of a shift from rigid to flexible thinking but getting my head around both ways of working has helped me become less programmatically autistic.

Once we've got a reference to our base directory we can start creating a folder tree with base being the trunk. This then allows us to create document leaves. One thing to remember though is that we can get directories and files… and we can get a reference to them even if they don't exist by telling the API to create them if they don't exist. But we do need to be careful about creating documents in folders that don't yet exist and this is the one big gotcha I'd like to point out to you in this piece. I'd like to tell you how we got around it too.

First though let's look at an example of getting a file and creating it if necessary:

gFileSystem.root.getFile(
    "base/test.txt",
    {
        create:true
    },
    function(file){
        console.log("Created File", file);
    },
    function(err){
        console.error(err);
    }
);

Again, you'll have to excuse the formatting (this is the last time I apologise about it though - I know we could do this as a one-liner but I like this passing of functions and I really do think functions deserve their own line… and if functions deserve their own lines surely objects and strings do too?). Hopefully if you enter this into the console you'll get a nice message saying Created File and a FileEntry object. Go ahead and expand that FileEntry object and check that it has these attributes: filesystem: DOMFileSystem, fullPath: "/base/test.txt", isDirectory: false, isFile: true and name: "test.txt".

That's grand isn't it? We've created a file... but we might want to write to it. We can do that by using a FileWriter on the FileEntry:

gFileSystem.root.getFile(
    "base/test.txt",
    {
        create:true
    },
    function(file){
        console.log("Got File", file);
        file.createWriter(
            function(fileWriter) {
                fileWriter.onwriteend = function(progress) {
                    console.log("Write completed", progress);
                };
                fileWriter.onerror = function(err) {
                    console.error("Write failed", err);
                };
                var blob = new Blob(
                    ['Lorem Ipsum'],
                    {
                        type: 'text/plain'
                    }
                );
                fileWriter.write(blob);
            },
            function(err){
                console.error("Error creating writer", err);
            }
        );
    },
    function(err){
        console.error(err)
    }
);

If you now navigate to base/test.txt in FileSystem Explorer and click on the file, Chrome should open a new file with "Lorem Ipsum" in it. Cool ehh?

There are all sorts of other things that you can do with a fileWriter like append lines or save binary data like images. Have an explore but do remember that once you have a reference to the file the world's your lobster!
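As a taste of the appending case, here's a minimal sketch. The helper name appendText is my own; the important part is the call to seek, which moves the write position to the end of the file so that the new content doesn't overwrite what is already there.

```javascript
// Hypothetical helper: append text to the end of an existing FileEntry.
function appendText(fileEntry, text, onDone, onError) {
    fileEntry.createWriter(
        function (fileWriter) {
            fileWriter.onwriteend = onDone;
            fileWriter.onerror = onError;
            // Move the write position to the end of the file before writing.
            fileWriter.seek(fileWriter.length);
            fileWriter.write(new Blob([text], { type: "text/plain" }));
        },
        onError
    );
}

// In the Chrome console, assuming gFileSystem from earlier:
// gFileSystem.root.getFile(
//     "base/test.txt",
//     { create: false },
//     function (file) {
//         appendText(
//             file,
//             "\nDolor sit amet",
//             function () { console.log("Appended"); },
//             function (err) { console.error(err); }
//         );
//     },
//     function (err) { console.error(err); }
// );
```

Without the seek the writer starts at position 0, so the new text would be written over the beginning of the file.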

Gotcha

So we can write to a file once we have a reference to it, whether or not it exists, and we can use directories once we have a reference to them, whether or not they exist… what about getting a reference to a file that doesn't yet exist within a directory that doesn't yet exist? This is the major issue we dealt with and it had us stumped for quite a while. If you replace base/test.txt with base/test/test.txt you'll get a lovely FileError object on the console with these attributes: code: 1, message: "A requested file or directory could not be found at the time an operation was processed." and name: "NotFoundError".

Clear as mud ehh?

Basically we need to make sure that the directory we're looking in exists before we can look for the file because while we set create to true for the getting of the file, we don't set create to true for the directory because we aren't using getDirectory. Phew! HTML5 Rocks has a lovely function for just this situation called createDir which accepts a DirectoryEntry object (such as the root of a DOMFileSystem) and an array representing a directory path (i.e. in our example: ["base", "test"]):

var path = 'base/test';

function createDir(rootDirEntry, folders) {
    // Throw out './' or '/' and move on to prevent something like '/foo/.//bar'.
    if (folders[0] == '.' || folders[0] == '') {
        folders = folders.slice(1);
    }
    rootDirEntry.getDirectory(
        folders[0],
        {
            create: true
        },
        function(dirEntry) {
            // Recursively add the new subfolder (if we still have another to create).
            // Checking for more than one entry stops us recursing with an empty array.
            if (folders.length > 1) {
                createDir(dirEntry, folders.slice(1));
            }
        },
        function(err){
            console.error(err);
        }
    );
};

createDir(gFileSystem.root, path.split('/'));

This is brilliant but we needed a reference to a file so we could write to it so we came up with this:

var path_file = "base/test/test.txt";

function createDirAndFile(rootDirEntry, folders, callback) {
    if (folders[0] == '.' || folders[0] == '') {
        folders = folders.slice(1);
    }
    rootDirEntry.getDirectory(
        folders[0],
        {
            create: true
        },
        function(dirEntry) {
            if (folders.length > 2) {
                createDirAndFile(dirEntry, folders.slice(1), callback);
            }else{
                callback();
            }
        },
        function(err){
            console.error(err);
        }
    );
};

createDirAndFile(
    gFileSystem.root,
    path_file.split('/'),
    function(){
        gFileSystem.root.getFile(
            path_file,
            {
                create:true
            },
            function(file){
                console.log("Got File", file);
                file.createWriter(
                    function(fileWriter) {
                        fileWriter.onwriteend = function(progress) {
                            console.log("Write completed", progress);
                        };
                        fileWriter.onerror = function(err) {
                            console.error("Write failed", err);
                        };
                        var blob = new Blob(
                            ['Lorem Ipsum'],
                            {
                                type: 'text/plain'
                            }
                        );
                        fileWriter.write(blob);
                    },
                    function(err){
                        console.error("Error creating writer", err);
                    }
                );
            },
            function(err){
                console.error(err);
            }
        );
    }
);

It's lovely because we use recursion to create the directory structure (in the same way as createDir did) before writing the file… it wasn't of a great deal of use in our most recent application as the hierarchy is really quite shallow but I think it will be of use later on. Go ahead and try it in the console, you should end up with the expected directory structure when you look at it with FileSystem Explorer.
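One thing the function above relies on is the path having at least one folder in front of the file name. A variation that splits the path up first might look like the sketch below; splitPath and ensureFolders are names I've invented for illustration, and this is an alternative take rather than the code we used.

```javascript
// Hypothetical helper: separate the directories in a path from the file name.
function splitPath(path) {
    var parts = path.split("/").filter(function (part) {
        // Drop empty segments and "." so "/base/./test.txt" behaves sensibly.
        return part !== "" && part !== ".";
    });
    return {
        folders: parts.slice(0, -1),
        fileName: parts[parts.length - 1]
    };
}

// Create each folder in turn, then hand the deepest directory to the callback.
function ensureFolders(dirEntry, folders, callback, onError) {
    if (folders.length === 0) {
        callback(dirEntry);
        return;
    }
    dirEntry.getDirectory(
        folders[0],
        { create: true },
        function (child) {
            ensureFolders(child, folders.slice(1), callback, onError);
        },
        onError
    );
}

// In the Chrome console, assuming gFileSystem from earlier:
// var target = splitPath("base/test/test.txt");
// ensureFolders(
//     gFileSystem.root,
//     target.folders,
//     function (dir) {
//         dir.getFile(
//             target.fileName,
//             { create: true },
//             function (file) { console.log("Got File", file); },
//             function (err) { console.error(err); }
//         );
//     },
//     function (err) { console.error(err); }
// );
```

Because the callback receives the deepest directory, the file can be fetched relative to it rather than by repeating the whole path from the root.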

Conclusion

The FileSystem API ROCKS! It can cause something of an existential crisis for those of us who are not used to working with a FileSystem - except on the server - but once you get your head around it it really does make a lot of sense. Most of the articles use a single shared error handler to deal with the errors that can occur but, if you're looking at the console trying to debug what went wrong where, that error handler function generally doesn't give you a useful line number. This is really, really annoying and we ended up commenting out the internals of our functions to try and track down the bug, so we went back to passing inline error callbacks containing console.log calls to the relevant functions.
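A halfway house between one shared handler and a wall of inline callbacks is a small factory which labels each handler with the place it was attached. This is just a sketch of that idea: makeErrorHandler is a name I've made up, and the optional log parameter is only there so the helper can be exercised without a console.

```javascript
// Hypothetical helper: build an error callback that remembers where it was attached.
function makeErrorHandler(label, log) {
    // `log` defaults to console.error; it is injectable to make testing easier.
    var report = log || function (message, err) {
        console.error(message, err);
    };
    return function (err) {
        var name = (err && err.name) ? err.name : "UnknownError";
        report("[" + label + "] " + name, err);
    };
}

// In the Chrome console, assuming gFileSystem from earlier:
// gFileSystem.root.getFile(
//     "base/test/test.txt",
//     { create: true },
//     function (file) { console.log("Got File", file); },
//     makeErrorHandler("getFile base/test/test.txt")
// );
```

The label does the job the missing line number would have done, so a NotFoundError on the console at least says which call produced it.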