PerfectLearn Version 1.0 Beta User Feedback

Two weeks ago, I published the beta version of PerfectLearn and asked a number of people who had expressed a desire to try PerfectLearn to start testing it. These two weeks have been quite a ride. But, first things first. The quality of the feedback from the beta users has been nothing short of fantastic. Thank you.

Several bugs have been found and fixed. One of the bugs, specifically, turned out to be quite a nasty one. Luckily, switching to the PostgreSQL database has fixed the issue completely.

Using PerfectLearn

In addition to fixing the bugs, I have also decided to implement some of the suggestions from the last two weeks of feedback. Two features in particular have already been implemented: the (inline) quick help option and an interactive component for displaying related topics. A short screencast demonstrating the two features is available on YouTube.

Both features were straightforward to implement and provide value to the user. The quick help option helps users familiarize themselves with some of the application’s more prominent user interface elements. The related topics component, in turn, improves the user experience by showing only the necessary navigational information (the topic context) while at the same time making it possible to interact with that information. It’s important to note that the previous version of PerfectLearn displayed related topics as well. However, it did so in a completely static way that did not allow the user to filter the related topics by association type and member role.

Finally, I hope (and expect) to finish PerfectLearn’s beta testing within the next fortnight. I feel privileged to have the users I have. I only hope that PerfectLearn lives up to your expectations.

Stay tuned for updates. Subscribe to the PerfectLearn newsletter.

How I used the CIA World Factbook to test my product

In preparation for the release of the first version of PerfectLearn, testing is the order of the day. To make the testing process both more realistic and more enjoyable, I decided to load an external dataset into PerfectLearn to see how it handled a non-trivial topic map.


Screencast showing the CIA World Factbook data after it has been imported into PerfectLearn.

After searching online for a couple of hours, I finally settled on the CIA World Factbook, which in its own words “provides information on the history, people, government, economy, geography, communications, transportation, military, and transnational issues for 267 world entities.” All in all, the World Factbook is an interesting dataset that the CIA has made available for personal use.

The first thing to do when confronted with a task like this is to try to get a basic understanding of the nature of the data. After examining the contents of the decompressed factbook.zip file I concluded that the following files and directories were sufficient to extract the necessary information to build the initial topic map ontology with some supporting images for each country’s topic:

  • geos
    • *.html: HTML documents for the 267 world entities.
    • print/country/*.pdf: the corresponding PDF documents for the 267 world entities.
  • graphics
    • flags/large/*.gif: country flags in GIF format.
    • maps/newmaps/*.gif: country maps in GIF format.
  • wfbExt
    • sourceXML.xml: XML file mapping country names, codes, and the corresponding regions.
CIA World Factbook Directory

There really is much more data available in the World Factbook than what I have described here. For example, the fields and rankorder directories contain all kinds of data related to country comparisons (within several categories), and the appendix directory contains information about international organizations and groups, international environmental agreements, and so forth. Furthermore, there are both physical and political maps and population pyramids (in BMP format!) for all of the countries and territories. In short, the World Factbook is comprehensive, to say the least.

With an initial understanding of the data, the next step is to extract the information that is relevant for the current purpose. The HTML files in the geos directory provide the majority of the actual content for the countries, territories, and regions. In addition, the wfbExt/sourceXML.xml file (an excerpt of which is provided below) provides a convenient mapping between the countries and accompanying regions. That is, each country record in the sourceXML.xml file includes “name”, “fips”, and “Region” attributes, which effectively link countries with regions while also providing the country code (the fips field) for the individual countries (and territories). The sourceXML.xml file will be crucial in the next phase when we are actually importing data into the topic map. For now, however, we need to focus on extracting the text for each country’s topic.

sourceXML.xml file excerpt

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<country>
	<country name="Afghanistan" fips="AF" Region="South Asia" />
	<country name="Akrotiri" fips="AX" Region="Europe" />
	<country name="Albania" fips="AL" Region="Europe" />
	<country name="Algeria" fips="AG" Region="Africa" />
	<country name="American Samoa" fips="AQ" Region="Oceania" />
	<country name="Andorra" fips="AN" Region="Europe" />
	<country name="Angola" fips="AO" Region="Africa" />
	<country name="Anguilla" fips="AV" Region="Central America" />
    ...
</country>
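
To make the mapping concrete, here is a minimal sketch of reading those attributes with Groovy's XmlSlurper and grouping the countries by region. It works on a few records from the excerpt above, pasted in as a string rather than read from disk, and the variable names are chosen purely for illustration. It assumes a Groovy version where XmlSlurper is available without an import (as in the scripts in this article); on Groovy 4 and later you would need to add import groovy.xml.XmlSlurper.

```groovy
// Illustrative only: a few records pasted in as a string instead of
// being read from wfbExt/sourceXML.xml.
def excerpt = '''<country>
	<country name="Afghanistan" fips="AF" Region="South Asia" />
	<country name="Akrotiri" fips="AX" Region="Europe" />
	<country name="Albania" fips="AL" Region="Europe" />
	<country name="Algeria" fips="AG" Region="Africa" />
</country>'''

// Each child <country> element becomes a plain map.
def records = new XmlSlurper().parseText(excerpt).country.collect {
	[name: it.@name.text(), code: it.@fips.text().toLowerCase(), region: it.@Region.text()]
}

// Group the country names by region.
def countriesByRegion = records
	.groupBy { it.region }
	.collectEntries { region, items -> [(region): items*.name] }

countriesByRegion.each { region, names -> println "$region: ${names.join(', ')}" }
```

The lower-cased fips code is what ties a record to its HTML (and later text) file in the geos directory.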

To painlessly extract data from HTML I normally resort to Apache Tika, a Java library that makes it easy to extract metadata and text from numerous file types, including (but not limited to) PDFs, Word files, Excel files, PowerPoint files, and, in this case, HTML files.

All in all, only two (Groovy) scripts are required to extract the text from HTML files and import the data into PerfectLearn while at the same time creating the necessary relationships between the topics. What the first script, Extract.groovy (provided below), does is relatively straightforward. First of all, it imports the necessary Apache Tika classes (lines 7-11), defines the source and target paths for the directory with the original HTML files and the directory to write the text files with the extracted text (lines 14-19), followed by creating the target directory (line 25). Next, the extraction of text from the HTML files starts by iterating over all of the HTML files (in the source directory) and calling the extractContent function to actually extract the textual content from each of the HTML files which is subsequently written to a file in the processed directory (lines 27-39). The extractContent function is the most complex code in this script but all it does is request Tika to return the content of the document’s body as a plain-text string by removing all the HTML-related markup (lines 45-65) after which the extracted text is passed to the sanitize function (lines 71-79) to remove superfluous text and to inject some markup to ensure better legibility of the text when it’s finally rendered in PerfectLearn. As you can see, Tika is doing the vast majority of the heavy lifting in this script.

Extract.groovy

/*
Extract country text script (from accompanying HTML files)
By Brett Alistair Kromkamp
January 09, 2015
*/

import org.apache.tika.Tika
import org.apache.tika.metadata.Metadata
import org.apache.tika.parser.html.HtmlParser
import org.apache.tika.parser.ParseContext
import org.apache.tika.sax.BodyContentHandler

// ***** Constants *****
final def ORIGINAL_PATH = '/home/brettk/Source/groovy/perfectlearn-miscellaneous/cia-factbook/data/original/geos'
final def PROCESSED_PATH = '/home/brettk/Source/groovy/perfectlearn-miscellaneous/cia-factbook/data/processed/geos'

// ***** Setup *****
def originalDirectory = new File(ORIGINAL_PATH)
def processedDirectory = new File(PROCESSED_PATH)

// ***** Logic *****
println 'Starting extraction process.'

// Create 'processed' directory.
processedDirectory.mkdirs() // Non-destructive.

originalDirectory.eachFile { file ->
    if (file.isFile() && file.name.endsWith('.html')) {
        def textFileName = generateTextFileName(file.name.toString())

        // Create file with extracted text.
        def textFile = new File("$PROCESSED_PATH/$textFileName")
        textFile.withWriter { out ->
            def textContent = extractContent(file.text)
            println textFileName
            out.writeLine(textContent)
        }
    }
}

println 'Done!'

// ***** Helper functions *****

String extractContent(String content) {
    BodyContentHandler handler = new BodyContentHandler()
    Metadata metadata = new Metadata()
    InputStream stream

    def result = ''
    try {
        if (content != null) {
            stream = new ByteArrayInputStream(content.getBytes())
            new HtmlParser().parse(
                stream, 
                handler, 
                metadata, 
                new ParseContext())
            result = sanitize(handler.toString()).trim()
        } 
        return result
    } finally {
        stream?.close() // stream is still null when content is null
    }
}

String generateTextFileName(String htmlFileName) {
    return htmlFileName.replaceAll(~/\.html/, '') + '.txt'
}

String sanitize(String content) {
    return content
        .replaceAll(~/(?m)^\s+/, '')
        .replaceAll(~/(?s)^Javascript.*Introduction ::/, 'Introduction ::')
        .replaceAll(~/(?s)EXPAND ALL.*/, '')
        .replaceAll(~/(?m)^([A-Z].*\s+)::.*/, '<h2>$1</h2>')
        .replaceAll(~/(?m)^([a-z])([a-z\s]*):/, '<strong>$1$2</strong>: ')
        .replaceAll(~/(?m)^([A-Z])([a-z\s-]*):/, '<h3>$1$2</h3>')
}

The next script, Import.groovy (provided below), although longer than the previous one, is relatively straightforward as well. The important thing to realize is that its main function is to iterate over the previously mentioned sourceXML.xml file to create and store the countries, territories, and regions (as topics) in the topic map. First of all, the script imports the necessary Java libraries, including the PerfectLearn topic map engine (lines 8-17), and then sets up the necessary constants for paths, database-related parameters, and other miscellaneous values (lines 21-32). Next, it instantiates the PerfectLearn topic map engine (line 38) and creates some required topics for the World Factbook topic map ontology (lines 42-56). Once the necessary topics have been created, the sourceXML.xml file is loaded and the country/territory/region records are read into a list (lines 62-64) for subsequent iteration (line 69). On each iteration the following actions are performed:

  • The required region identifier, region name, country identifier, country name, and country code are extracted for subsequent use (lines 72-77).
  • The textual content for each country, territory, or region is retrieved from the appropriate text file that was generated by the Extract.groovy script (lines 81-86).
  • The background information, excerpt, and timeline year are retrieved for each country, territory, or region to create the necessary metadata for subsequent display in the timeline component (lines 90-105, and lines 256-272, 274-283, 285-290, for the getBackgroundExcerpt, getBackground, and getTimelineYear functions, respectively).
  • The country or territory topic is created and stored (lines 109-114).
  • The country or territory text occurrence is created and stored (lines 118-126).
  • The region topic is created and stored (lines 130-137).
  • The association (that is, relationship) between a country or territory and its concomitant region is stored (line 141).
  • Coordinates are extracted from the country’s textual content and, if the second set of coordinates is present (for the capital city), the metadatum with the coordinates is created and stored for subsequent visualization in the map component. The convertToDdCoordinates function is called with the extracted coordinates to convert from a degrees-minutes format to the decimal degrees format required by Google Maps (lines 145-149, and lines 241-254 for the convertToDdCoordinates function).
  • A link (occurrence) is added for each country, territory, or region pointing back to the appropriate page in the CIA World Factbook website (lines 153-161).
  • The flag (occurrence) is added for each country (lines 165-179, and lines 292-301 for the copyFile function).
  • The map (occurrence) is added for each country or territory (lines 185-195, and lines 292-301 for the copyFile function).
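
As a worked example of the coordinate conversion mentioned in the list above: decimal degrees are the degrees plus the minutes divided by 60, negated for southern latitudes and western longitudes. So 17 49 S, 31 02 E becomes roughly (-17.8167, 31.0333). The sketch below is a standalone variant of that arithmetic, not the convertToDdCoordinates function itself; the helper name toDecimalDegrees and the rounding to four decimals are assumptions made for illustration.

```groovy
import java.math.RoundingMode

// Hypothetical helper: converts one "degrees minutes hemisphere" triple,
// for example "17 49 S", to signed decimal degrees.
BigDecimal toDecimalDegrees(String dms) {
	def (degrees, minutes, hemisphere) = dms.trim().split(/\s+/)
	BigDecimal value = degrees.toInteger() + minutes.toInteger() / 60
	// Southern latitudes and western longitudes are negative.
	if (hemisphere in ['S', 'W']) {
		value = -value
	}
	return value.setScale(4, RoundingMode.HALF_UP)
}

def (latitude, longitude) = '17 49 S, 31 02 E'.split(',').collect { toDecimalDegrees(it) }
println "($latitude, $longitude)" // prints (-17.8167, 31.0333)
```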

Next, the associations to establish the appropriate relationships between the regions themselves and between the regions and the "world" (topic) are created and stored in the topic map (lines 201-219). Finally, the textual content for the world topic is retrieved (lines 223-226) and the accompanying occurrence is created and saved (lines 228-236).

Import.groovy

/*
Import CIA World Factbook into PerfectLearn Topic Map Engine
By Brett Alistair Kromkamp
January 15, 2015
*/

// Import necessary Java libraries including the PerfectLearn topic map engine.
import com.polishedcode.crystalmind.base.Utils
import com.polishedcode.crystalmind.base.Language;
import com.polishedcode.crystalmind.map.store.TopicStore;
import com.polishedcode.crystalmind.map.store.TopicStoreException;
import com.polishedcode.crystalmind.map.model.*

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// ***** Constants *****
// Setup necessary paths, database-related parameters, and other miscellaneous values.
final def COUNTRIES_PATH = '/home/brettk/Source/groovy/perfectlearn-miscellaneous/cia-factbook/data/original/wfbExt/sourceXML.xml'
final def MAPS_PATH = '/home/brettk/Source/groovy/perfectlearn-miscellaneous/cia-factbook/data/original/graphics/maps/newmaps'
final def FLAGS_PATH = '/home/brettk/Source/groovy/perfectlearn-miscellaneous/cia-factbook/data/original/graphics/flags/large'
final def PROCESSED_PATH = '/home/brettk/Source/groovy/perfectlearn-miscellaneous/cia-factbook/data/processed/geos'

final def DATABASE = 'pldb_1'
final def SHARD_INFO = "localhost;3306;${DATABASE}"
final def USERNAME = '********'
final def PASSWORD = '********'
final long TOPIC_MAP_IDENTIFIER = 64L
final def COUNTRIES_TOTAL = 268
final def UNIVERSAL_SCOPE = '*'

// ***** Logic *****
println 'Starting importing process.'

// Instantiate the PerfectLearn topic store.
TopicStore topicStore = new TopicStore(USERNAME, PASSWORD)

// Bootstrap required topics.
println 'Bootstrapping...'
def bootstrapTopics = [
	new Entity(identifier: 'country', name: 'Country', instanceOf: 'topic'),
	new Entity(identifier: 'region', name: 'Region', instanceOf: 'topic'),
	new Entity(identifier: 'world', name: 'The World', instanceOf: 'topic'),
	new Entity(identifier: 'part-of', name: 'Part Of', instanceOf: 'topic')
]

bootstrapTopics.each { bootstrapTopic ->
	Topic topic = new Topic(
		bootstrapTopic.identifier,
		bootstrapTopic.instanceOf,
		bootstrapTopic.name, 
		Language.EN)
	topicStore.putTopic(SHARD_INFO, TOPIC_MAP_IDENTIFIER, topic, Language.EN)
}

/*
Iterate over country records (in sourceXML.xml) by extracting the necessary
attributes to create countries, territories, and regions.
*/
def countriesContent = new File(COUNTRIES_PATH).text
def countriesXml = new XmlSlurper().parseText(countriesContent)
def countries = countriesXml.country

assert COUNTRIES_TOTAL == countries.size()

println 'Iterating over countries...'
for (country in countries) {
	// For each country/territory/region extract the region identifier, 
	// region name, country identifier, country name, and country code. 
	def regionIdentifier = Utils.slugify(country.@Region.text())
	if (regionIdentifier) {	
		def regionName = country.@Region.text()
		def countryIdentifier = Utils.slugify(country.@name.text())
		def countryName = country.@name.text() 
		def countryCode = country.@fips.text().toLowerCase()

		// Get topic's text.
		println "Getting topic's text..."
		def topicContentPath = "$PROCESSED_PATH/${countryCode}.txt"
		def topicContentFile = new File(topicContentPath)
		def topicContent = ''
		if (topicContentFile.exists()) {
			topicContent = topicContentFile.text
		}

		// Extract the country's background excerpt.
		println "Extracting the country's background excerpt..."
		if (topicContent) {
			def excerpt = getBackgroundExcerpt(topicContent)
			def background = getBackground(topicContent)

			// Add the appropriate timeline related meta data.
			println 'Adding the timeline metadata...'
			if (excerpt && background) {
				def timelineYear = getTimelineYear(background)
				def timelineMedia = "<blockquote>${excerpt.find(~/(?s)^\S*^(.*?)[.?!]\s/).trim()}</blockquote>".toString()
				if (timelineYear && timelineMedia && excerpt) {
					topicStore.createMetadatum(SHARD_INFO, TOPIC_MAP_IDENTIFIER, 'timeline-event-startdate', timelineYear, countryIdentifier, Language.EN, '', DataType.STRING, UNIVERSAL_SCOPE)
					topicStore.createMetadatum(SHARD_INFO, TOPIC_MAP_IDENTIFIER, 'timeline-media', timelineMedia, countryIdentifier, Language.EN, '', DataType.STRING, UNIVERSAL_SCOPE)
					topicStore.createMetadatum(SHARD_INFO, TOPIC_MAP_IDENTIFIER, 'timeline-text', excerpt, countryIdentifier, Language.EN, '', DataType.STRING, UNIVERSAL_SCOPE)
				}
			}
		}

		// Create and store the country or territory topic.
		println 'Creating and storing the country topic...'
		Topic countryTopic = new Topic(
			countryIdentifier,
			'country',
			countryName, 
			Language.EN)
		topicStore.putTopic(SHARD_INFO, TOPIC_MAP_IDENTIFIER, countryTopic, Language.EN)

		// Create and store the topic's text occurrence.
		println "Creating and storing the topic's text..."
		Occurrence occurrence = new Occurrence(countryIdentifier)
		occurrence.with {
			instanceOf = 'text'
			scope = UNIVERSAL_SCOPE
			language = Language.EN
			resourceData = topicContent.getBytes()	
		}
		topicStore.putOccurrence(SHARD_INFO, TOPIC_MAP_IDENTIFIER, occurrence)
		topicStore.createMetadatum(SHARD_INFO, TOPIC_MAP_IDENTIFIER, 'label', countryName, occurrence.identifier, Language.EN, '', DataType.STRING, UNIVERSAL_SCOPE)

		// Create and store the region topic.
		println 'Creating and storing the region topic...'
		if (!topicStore.topicExists(SHARD_INFO, TOPIC_MAP_IDENTIFIER, regionIdentifier)) {
			Topic regionTopic = new Topic(
				regionIdentifier,
				'region',
				regionName, 
				Language.EN)
			topicStore.putTopic(SHARD_INFO, TOPIC_MAP_IDENTIFIER, regionTopic, Language.EN)
		}

		// Create associations between countries and regions.
		println 'Creating associations between countries and regions...'
		topicStore.createAssociation(SHARD_INFO, TOPIC_MAP_IDENTIFIER, 'country', countryIdentifier, 'region', regionIdentifier)

		// Create coordinates metadatum for each country's capital.
		println "Creating coordinates for country's capital..."
		def coordinates = topicContent.findAll(~/(?m)(^[-+]?\d{1,2}\s*\d{1,2}\s*[A-Z]),\s*([-+]?\d{1,2}\s*\d{1,3}\s*[A-Z])/)
		if (coordinates[1]) {
			def ddCoordinates = convertToDdCoordinates(coordinates[1])
			topicStore.createMetadatum(SHARD_INFO, TOPIC_MAP_IDENTIFIER, 'map-coordinates', ddCoordinates, countryIdentifier, Language.EN, '', DataType.STRING, UNIVERSAL_SCOPE)
		}

		// Add link occurrence to each topic pointing to the original CIA World Factbook country page. 
		println 'Adding CIA World Factbook country page link...'
		Occurrence linkOccurrence = new Occurrence(countryIdentifier)
		linkOccurrence.with {
			instanceOf = 'url'
			scope = UNIVERSAL_SCOPE
			language = Language.EN
			resourceRef = "https://www.cia.gov/library/publications/the-world-factbook/geos/${countryCode}.html"
		}
		topicStore.putOccurrence(SHARD_INFO, TOPIC_MAP_IDENTIFIER, linkOccurrence)
		topicStore.createMetadatum(SHARD_INFO, TOPIC_MAP_IDENTIFIER, 'label', "$countryName CIA World Factbook Page", linkOccurrence.identifier, Language.EN, '', DataType.STRING, UNIVERSAL_SCOPE)

		// Add flag (occurrence) to each topic and copy image to appropriate (web application resources) directory.
		println 'Adding flag...'
		def imageDirectoryName = "/home/brettk/www/static/$TOPIC_MAP_IDENTIFIER/images/$countryIdentifier"
		def imageDirectory = new File(imageDirectoryName)
		imageDirectory.mkdirs() // Non-destructive.

		def serverImageDirectoryName = "/static/$TOPIC_MAP_IDENTIFIER/images/$countryIdentifier"
		
		Occurrence flagOccurrence = new Occurrence(countryIdentifier)
		flagOccurrence.with {
			instanceOf = 'image'
			scope = UNIVERSAL_SCOPE
			language = Language.EN
			resourceRef = "$serverImageDirectoryName/${flagOccurrence.identifier}.gif"
		}
		topicStore.putOccurrence(SHARD_INFO, TOPIC_MAP_IDENTIFIER, flagOccurrence)
		topicStore.createMetadatum(SHARD_INFO, TOPIC_MAP_IDENTIFIER, 'label', "$countryName (Flag)", flagOccurrence.identifier, Language.EN, '', DataType.STRING, UNIVERSAL_SCOPE)

		copyFile("$FLAGS_PATH/${countryCode}-lgflag.gif", "$imageDirectoryName/${flagOccurrence.identifier}.gif")

		// Add map (occurrence) to each topic.
		println 'Adding map...'
		Occurrence mapOccurrence = new Occurrence(countryIdentifier)
		mapOccurrence.with {
			instanceOf = 'image'
			scope = UNIVERSAL_SCOPE
			language = Language.EN
			resourceRef = "$serverImageDirectoryName/${mapOccurrence.identifier}.gif"
		}
		topicStore.putOccurrence(SHARD_INFO, TOPIC_MAP_IDENTIFIER, mapOccurrence)
		topicStore.createMetadatum(SHARD_INFO, TOPIC_MAP_IDENTIFIER, 'label', "$countryName (Map)", mapOccurrence.identifier, Language.EN, '', DataType.STRING, UNIVERSAL_SCOPE)

		copyFile("$MAPS_PATH/${countryCode}-map.gif", "$imageDirectoryName/${mapOccurrence.identifier}.gif")
	}
}

// Create associations between regions.
println 'Creating associations between regions...'
def regionIdentifiers = [
	'africa',
	'central-america',
	'central-asia',
	'east-asia',
	'europe',
	'middle-east',
	'north-america',
	'oceania',
	'south-america',
	'south-asia'
]
for (outerRegionIdentifier in regionIdentifiers) {
	for (innerRegionIdentifier in regionIdentifiers.findAll { it != outerRegionIdentifier } ) {
		topicStore.createAssociation(SHARD_INFO, TOPIC_MAP_IDENTIFIER, 'region', outerRegionIdentifier, 'region', innerRegionIdentifier)
	}
	// Create associations between the world topic and the regions.
	topicStore.createAssociation(SHARD_INFO, TOPIC_MAP_IDENTIFIER, 'part-of', 'world', 'region', outerRegionIdentifier)
}

// Add the appropriate text occurrence ('xx.txt') to the 'world' topic.
println "Adding text occurrence to the 'World' topic..."
def worldTopicContentFileName = "${PROCESSED_PATH}/xx.txt"

def worldTopicContentFile = new File(worldTopicContentFileName)
def worldTopicContent = worldTopicContentFile.text

Occurrence worldOccurrence = new Occurrence('world')
worldOccurrence.with {
	instanceOf = 'text'
	scope = UNIVERSAL_SCOPE
	language = Language.EN
	resourceData = worldTopicContent.getBytes()
}
topicStore.putOccurrence(SHARD_INFO, TOPIC_MAP_IDENTIFIER, worldOccurrence)
topicStore.createMetadatum(SHARD_INFO, TOPIC_MAP_IDENTIFIER, 'label', 'world', worldOccurrence.identifier, Language.EN, '', DataType.STRING, UNIVERSAL_SCOPE)

println 'Done!'

// ***** Helper methods *****
def convertToDdCoordinates(String dmsCoordinates) { // Format: 17 49 S, 31 02 E
	// http://en.wikipedia.org/wiki/Geographic_coordinate_conversion
	def parts = dmsCoordinates.replace(',', '').trim().split(/\s+/)

	def ddLatitude = parts[0].toInteger() + (parts[1].toInteger() / 60) 
	if (parts[2] == 'S') {
		ddLatitude = 0 - ddLatitude
	}
	def ddLongitude = parts[3].toInteger() + (parts[4].toInteger() / 60)
	if (parts[5] == 'W') {
		ddLongitude = 0 - ddLongitude
	}
	return "($ddLatitude, $ddLongitude)"
}

def getBackgroundExcerpt(String content) {
	def result = content
		.find(~/(?s)<\/h3>.*<h2>Geography/)
		?.replaceAll(~/<\/h3>/, '')
		?.replaceAll(~/<h2>Geography/, '')
	if (result) {
		if (result.size() > 320) {
			result = result[0..320]
		}
		if (result[-1] != '.') {
			result = result << '...'
		}
	} else {
		result = ''
	}
	return result.toString()
}

def getBackground(String content) {
	def result = content
		.find(~/(?s)<\/h3>.*<h2>Geography/)
		?.replaceAll(~/<\/h3>/, '')
		?.replaceAll(~/<h2>Geography/, '')
	if (result == null) {
		result = ''
	}
	return result
}

def getTimelineYear(String content) {
	def bcYears = content.findAll(~/\d{4}\sB\.C\./).collect { it[0..3] }
	def adYears = content.findAll(~/\d{4}/)
	def years = adYears - bcYears
	return years[0]
}

def copyFile(String sourcePath, String targetPath) {
	Path source = Paths.get(sourcePath)
	Path destination = Paths.get(targetPath)

	try {
		Files.copy(source, destination);
	} catch (IOException e) {
		e.printStackTrace();
	}
}

// ***** Models *****

class Entity {
	String identifier
	String name
	String instanceOf
}

And that’s it, folks! In a follow-up article I will document how to improve the import process outlined in this article to make much better use of the resources provided by the World Factbook. For this first iteration, however, the current import process provides me with sufficient data to thoroughly test PerfectLearn with a non-trivial topic map.

Topic map templates

An application like PerfectLearn can be used in a wide variety of scenarios. Although I normally think of PerfectLearn within the learning space, a friend of mine has been testing PerfectLearn and using the application to help him manage some of his work-related projects. While discussing his experience with PerfectLearn we came up with the idea of, what I am now calling, "topic map templates."

Network Graph

A topic map template, for all intents and purposes, is a predefined set of placeholder topics and the accompanying relationships between those topics. Selecting a topic map template from a list of available templates (from within PerfectLearn) would automatically generate a set of topics and associations, after which the user could carry on fleshing out the topics and the associations as if they had created them manually.

Advantages of using a topic map template include saving time (as you don’t have to manually create the topics and the relationships between the topics) and being able to repeatedly create a standardized topic map structure.

In the case of my friend, he found that he was creating the same kind of topics and relationships, over and over again, for the different projects he was managing. For example, he was creating the project topic itself, followed by topics for the project’s requirements, stakeholders, deliverables, time frames, and so forth. After creating the necessary topics he would then have to create the accompanying associations between them. All in all, it was an error-prone and time-consuming process and, if you think about it, an unnecessary one. It is something that can be automated away.
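
To make the idea concrete, a template could be little more than a list of placeholder topics plus the associations between them, expanded once per project. The sketch below is purely hypothetical: the map layout, the applyTemplate function, and the identifier scheme are invented for illustration and are not part of PerfectLearn.

```groovy
// Hypothetical template: placeholder topics and the associations between them.
def projectTemplate = [
	topics: ['project', 'requirements', 'stakeholders', 'deliverables', 'time-frames'],
	associations: [
		['project', 'requirements'],
		['project', 'stakeholders'],
		['project', 'deliverables'],
		['project', 'time-frames']
	]
]

// Expands a template into concrete topic identifiers and associations by
// prefixing every placeholder with a slug of the project name.
def applyTemplate(Map template, String projectName) {
	def prefix = projectName.toLowerCase().replaceAll(/[^a-z0-9]+/, '-')
	def identifierFor = { placeholder -> "$prefix-$placeholder".toString() }
	return [
		topics: template.topics.collect(identifierFor),
		associations: template.associations.collect { pair -> pair.collect(identifierFor) }
	]
}

def generated = applyTemplate(projectTemplate, 'Website Redesign')
generated.topics.each { println it }
```

Applying the same template to a second project name would yield the same structure under a different prefix, which is the "standardized topic map structure" benefit mentioned earlier.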

With regard to topic map templates, I envisage the PerfectLearn community submitting templates to a central repository that other PerfectLearn users could browse or search from within the application. When a user finds an appropriate template, they download it, after which it is available within PerfectLearn to create the desired topic map structure. To ensure that the user has an accurate overview of what exactly the template provides, selecting a template would display a live visual graph of the template together with a description (provided by the user who submitted the template to the repository). Furthermore, the individual templates would be categorized, tagged, and user-rated to ensure that the most useful templates are easily found.

Topic map templates could range from the simple (only a handful of topics and associations) to the complex (dozens, or perhaps even hundreds, of topics and accompanying associations). In that respect, it should be possible to undo or roll back the application of a template to ensure that the user is not saddled with topics and associations that in retrospect are not applicable to their requirements.

Finally, I expect to start implementing the topic map template feature in version 1.2 of the application. If you have any suggestions with regard to the template feature, let me know by submitting the feedback form.

PerfectLearn feedback

Over the last couple of weeks I have released several screencasts (on the PerfectLearn YouTube Channel) in which I explain how PerfectLearn works and give an overview of its benefits.

The feedback from the many people who have watched the videos has been overwhelmingly positive. Not only have people expressed their interest in PerfectLearn but, perhaps more surprisingly, I have also received a lot of very useful insights at both a product level and a market(ing) level.

With this blog article it is my intention to capture (in no particular order) for future reference what I consider to be the most important insights I have obtained from discussing PerfectLearn with several people after having published the screencasts on YouTube.

User Data

There is nothing more important than knowledge, and for individuals in particular, their personal (documented) knowledge is of inestimable value. Hence, when evaluating an application like PerfectLearn, it is an important consideration for users to know that their investment is safe because, if necessary, they can get a full dump of their data and documented knowledge. In that respect, it makes sense to offer several ways for users to export their data, including JSON and XML dumps, HTML, and perhaps even Markdown.
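
As a rough sketch of what such an export could look like, the snippet below serialises a couple of topics to JSON and to Markdown using Groovy's built-in groovy.json.JsonOutput. The topic structure (identifier, name, text) and the sample values are assumptions made for illustration; this is not PerfectLearn's actual data model or export format.

```groovy
import groovy.json.JsonOutput

// Hypothetical in-memory representation of a user's topics.
def topics = [
	[identifier: 'albania', name: 'Albania', text: 'Notes about Albania.'],
	[identifier: 'europe', name: 'Europe', text: 'Notes about Europe.']
]

// JSON dump: machine-readable, easy to re-import elsewhere.
def json = JsonOutput.prettyPrint(JsonOutput.toJson([topics: topics]))

// Markdown dump: one heading per topic, readable without any tooling.
def markdown = topics.collect { "## ${it.name}\n\n${it.text}" }.join('\n\n')

println json
println markdown
```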

User Context and Touchpoints

Users will be accessing and using PerfectLearn in different locations, contexts, and on different devices. Obviously, one size doesn’t fit all. My thinking in that respect has been heavily influenced by the concepts of touchpoints and cross-channel blueprints (as outlined in the article Cross Channel Design With Alignment Diagrams). Specifically, I am leaning towards the following cross-channel blueprint:

In the above graph you will see how each touchpoint (that is, phone, tablet, and desktop computer) differs with respect to the main user intentions/interactions. On a phone, knowledge acquisition is the most important user intent; on a tablet, the intents are more equally divided between knowledge acquisition, knowledge surfacing, and knowledge organisation; and on a desktop machine, the user is probably more focused on the actual organisation of knowledge. The exact touchpoint proportions are arguable, but the principle of an application behaving differently depending on the user’s current touchpoint and context is valid.

Target Groups

With regard to marketing PerfectLearn and communicating its benefits, it makes sense to focus on specific user needs.

The plan is to first focus on knowledge (management) geeks and life-long learners. The next target group will be people who are researching/investigating one or more topics of interest (both professionally and non-professionally). Finally, the third target group will be both teachers and students. Obviously, these three groups are not mutually exclusive and it is more than likely that there is at least some overlap between them.

The important lesson to take away from this point is that you really do need to understand each group’s unique pain points and ensure that you effectively communicate how your product addresses those pain points.

Stay tuned for updates. Subscribe to the PerfectLearn newsletter.

Topic maps engine for mobile platforms

As part of the PerfectLearn project I am building an on-device topic maps engine (ported to both Android and iOS). For testing purposes, I decided to import the King James Version (KJV) Bible into a PerfectLearn-compatible topic maps format which in turn can be exported to a SQLite file format for subsequent use by the on-device topic maps engine.

The video below shows the navigation of the KJV Bible starting at the “root” navigation topic and successively drilling down until finally reaching Genesis Chapter 1 and Genesis Chapter 2, respectively.

Video: Codex Android app King James Version (KJV) Bible navigation, from Brett Kromkamp on Vimeo.

Both the Android and iOS versions of the topic maps engine will be compatible in terms of their models (that is, the data entities, including topics, occurrences, associations, base names, and metadata) and API (for example, putTopic, getTopic, getAssociation).
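
To make the shape of such an API concrete, here is a minimal Python sketch of a SQLite-backed store. It is not the engine's actual code: the table layout, the snake_case method names (mirroring putTopic, getTopic, and getAssociation) and the put_association helper are all assumptions made for illustration, and occurrences, scopes, and metadata are left out.

```python
import sqlite3


class TopicStore:
    """Tiny SQLite-backed store; the schema is an assumption for this sketch."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.executescript(
            """
            CREATE TABLE IF NOT EXISTS topic (
                identifier TEXT PRIMARY KEY,
                base_name  TEXT NOT NULL
            );
            CREATE TABLE IF NOT EXISTS association (
                identifier TEXT PRIMARY KEY,
                src_topic  TEXT NOT NULL,
                dest_topic TEXT NOT NULL
            );
            """
        )

    def put_topic(self, identifier, base_name):
        # INSERT OR REPLACE overwrites an existing topic with the same identifier.
        self.conn.execute(
            "INSERT OR REPLACE INTO topic (identifier, base_name) VALUES (?, ?)",
            (identifier, base_name),
        )
        self.conn.commit()

    def get_topic(self, identifier):
        row = self.conn.execute(
            "SELECT identifier, base_name FROM topic WHERE identifier = ?",
            (identifier,),
        ).fetchone()
        return None if row is None else {"identifier": row[0], "base_name": row[1]}

    def put_association(self, identifier, src_topic, dest_topic):
        # Hypothetical helper: links two existing topics by their identifiers.
        self.conn.execute(
            "INSERT OR REPLACE INTO association (identifier, src_topic, dest_topic) "
            "VALUES (?, ?, ?)",
            (identifier, src_topic, dest_topic),
        )
        self.conn.commit()

    def get_association(self, identifier):
        row = self.conn.execute(
            "SELECT identifier, src_topic, dest_topic FROM association "
            "WHERE identifier = ?",
            (identifier,),
        ).fetchone()
        if row is None:
            return None
        return {"identifier": row[0], "src_topic": row[1], "dest_topic": row[2]}
```

For example, storing "Genesis" and "Genesis Chapter 1" as topics and linking them with one association is enough to model a single step of the drill-down navigation shown in the video.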

The topic maps engine has a low memory footprint, is thread-safe and has the full expressive power of topic maps including (but not limited to) scopes, multilingual support (that is, multilingual base names, occurrences, and metadata), and full text search.

Topic maps technology is an “enabler” in terms of what it allows you to build. Specifically, on-device topic map engines can form the basis of a wide range of different app types. Many app categories are perfect targets for both external and (especially) on-device topic map-based solutions, including (but not limited to):

Finally, (on-device) topic maps engines make for a compelling use case in Glassware.

Update (January 15, 2014): Some people have asked what makes an on-device topic maps engine compelling. Well, one of the most requested features of mobile apps is the ability to function without internet access. On-device topic map engines allow you to do exactly that without sacrificing any advanced capabilities. In addition:

  • In many (developing) countries, sporadic internet access allows people to download your app but not to actually interact with the app’s content / functionality in a consistent manner, resulting in a degraded user experience.
  • When (open) WiFi access is not available, expensive mobile data plans make off-line content delivery platforms attractive.

Furthermore, having an on-device topic maps engine does not imply that the content on the device would not be (periodically) updated. Ideally, an app determines (on startup) if it has network access and if it does it subsequently checks for new content. If new content is available, it downloads and updates the topic map store with the new content.
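
The startup check described above could be sketched as follows in Python. Everything here is hypothetical: the refresh_content function, the version counter, and the three callables stand in for whatever connectivity check and download mechanism a real app would use.

```python
def refresh_content(has_network, fetch_latest_version, download_content, store):
    """Update `store` in place when newer content is available.

    `store` is a dict with a "version" number and a "topics" dict.
    Returns True if the store was updated, False otherwise.
    """
    if not has_network():
        return False  # offline: keep serving the content already on the device
    latest = fetch_latest_version()
    if latest <= store["version"]:
        return False  # the on-device content is already up to date
    # Merge the downloaded topics into the local topic map store.
    store["topics"].update(download_content(latest))
    store["version"] = latest
    return True
```

Passing the network check and the download step in as functions keeps the offline path trivial: when there is no connection, the app simply carries on with the topic map it already has.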

As a sidenote, if you watched the video you will have noticed that the navigation lists are unordered. That is because I have not yet implemented any sorting on the item objects. Secondly, the retrieval of associations is sometimes quite slow, which is due to the app running on an emulator (emulators are notoriously slow) in combination with some unoptimised code in the topic map engine.

PerfectLearn development update August 2013

I started the development of PerfectLearn almost four months ago and good progress has been made with the project. However, every time I think that I’m close to finishing, I realise that the application is still missing essential functionality. I published my initial list of outstanding tasks over a month ago when I thought that I was in the final stage of the project. Now, one month later, I estimate that I have at least another three or four weeks to go before I can confidently declare that PerfectLearn is ready for release. You know… the whole ninety-ninety rule.

PerfectLearn web application

Taking a look at the application’s todo list shows that I still have the following tasks to complete (grouped by task type):

Application development

  • Add and view tags. Tagging is an awesome way to organise and subsequently find your documented knowledge, and its implementation is absolutely imperative. I’m going to implement tagging using associations, which makes automatic categorisation of your information possible. Automatic categorisation is something that you have to experience to see how it works in practice.
  • Edit topic comments. At the moment, you can add and remove topic comments but you cannot edit them. It’s a quick thing to implement, but somehow I just haven’t got around to doing it.
  • Delete images and attachments. Uploading images and attachments for subsequent viewing and downloading is done; being able to delete them, however, is still pending implementation. The current upload implementation stores the files on the same server where the application is running. I am, however, considering swapping out the current implementation for an Amazon S3 implementation.
  • Edit links. Adding links to the current topic and removing links from the current topic is finished, but I still have to implement the ability to edit links. Again, it’s a quick piece of functionality to implement. Nevertheless, it has taken a back-seat to more compelling features. Obviously, it needs to be finished before release.
  • Add, edit and remove metadata. Behind the scenes, the PerfectLearn topic map engine uses the concept of metadata to complement the various entities (that is, topics, occurrences, and associations) within the application with additional information. A non-admin user is never exposed to the concept of metadata within the application; that is, metadata management is transparent to the “normal” user. However, the admin user does have the ability to manually manage metadata, and it’s this user interface-related functionality that is still pending implementation.
  • Topic search. I’m of the opinion that search is of less importance in a topic map-based system than in a non-topic map-based system because the former makes exploratory navigation of your documented knowledge inherently easy. Nonetheless, having full text search is obviously very useful when you just need to quickly find whatever you are looking for without much ceremony. I am still undecided as to which search engine I will use to implement search within PerfectLearn. Currently, I am reviewing both Elasticsearch and Apache Solr, both based on the Apache Lucene engine.
  • Translation of the application’s user interface into Spanish. According to Wikipedia, Spanish is the third-most used language on the web (2011 figures) with over 160 million users. In this context, it is also interesting to note that PerfectLearn has full support for multi-lingual content. That is, as a user you can easily switch between managing textual and binary content for different languages. Translation of the application’s user interface to Spanish is already on-going.
  • Supplemental navigation systems, including a topic index and next topic and previous topic navigation. Good Information Architecture (IA) advocates providing not just the so-called “embedded” navigation systems (that is, global, local, and contextual navigation) but also supplemental and social navigation systems like (topic) indexes and tags.
  • Google Drive integration. Google Drive is a powerful file storage and real-time collaboration environment. Being able to organise and access your Google Drive documents by topic from within the application is a very compelling feature. I will add Google Drive support after the first release of PerfectLearn.
  • Client-side form validation. I have already implemented server-side validation. Nonetheless, it only makes sense to include client-side validation to reduce the application’s network chattiness and to provide a more streamlined user experience.
  • Upgrade to Twitter Bootstrap 3 (including typeahead.js integration). Who within the web community hasn’t heard of Twitter Bootstrap yet? It is a very powerful front-end web framework that makes it very easy to implement clean and functional user interfaces. Recently, version 3.0 of the project was released, boasting a “mobile-first” approach that makes upgrading from version 2 of the framework a no-brainer.
  • User profile page. Self-explanatory.
  • Browse user portfolios and view individual portfolios. The personal network and portfolio are both dimensions of the personal learning environment and, in that respect, PerfectLearn has the ability to publish individual topics from your documented knowledge repository into your online, publicly accessible learning portfolio. The work on this part of the application is already ongoing. I just need to refine and polish the experience.
  • LinkedIn integration. From my point of view, a person using a personal learning environment will do so for several reasons. Obviously, managing their documented knowledge in an effective manner is probably quite high on that list. In addition, being able to evidence your knowledge (to, for example, a prospective employer) is equally important and that is where the “portfolio” aspect of a personal learning environment comes into play. Having the ability to surface your knowledge directly within your LinkedIn Activity feed provides you, the user, with real value.
  • Atom syndication format-based web feed for the user’s portfolio. Self-evident.
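
To give a feel for the “Add and view tags” item above, here is a minimal Python sketch of tagging modelled as associations between a topic and a tag. The pair-based representation and the names tag_topic and topics_by_tag are assumptions for illustration only; a real topic map association would also carry an association type and member roles.

```python
from collections import defaultdict

# Each association is a (topic identifier, tag identifier) pair. This is a
# simplification: real associations also have a type and member roles.
associations = set()


def tag_topic(topic_id, tag_id):
    """Tag a topic by recording an association between topic and tag."""
    associations.add((topic_id, tag_id))


def topics_by_tag():
    """Group topic identifiers under each tag (the categorisation view)."""
    grouped = defaultdict(list)
    for topic_id, tag_id in sorted(associations):
        grouped[tag_id].append(topic_id)
    return dict(grouped)
```

Because a tag is just another topic on the far side of an association, the grouping falls out of the existing data: no separate category structure has to be maintained by hand.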

Back-end development

  • getPublishedTopicReferences method. This method retrieves all of a topic’s related topics that have also been published in the user’s learning portfolio (explicitly excluding those topics that haven’t been publicly published) so as to provide the navigational context of a public topic without linking to unpublished or private topics.
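
A minimal Python sketch of the filtering this method performs might look like the following. The data shapes are assumptions: associations are reduced to (source, destination) identifier pairs and the set of published topics is passed in explicitly, whereas the real method would query the topic map store.

```python
def get_published_topic_references(topic_id, associations, published):
    """Return the identifiers of topics related to `topic_id` that are published.

    `associations` is an iterable of (source, destination) identifier pairs
    and `published` is a collection of published topic identifiers.
    """
    related = set()
    for source, destination in associations:
        # A topic is related if it sits on either side of an association.
        if source == topic_id:
            related.add(destination)
        elif destination == topic_id:
            related.add(source)
    # Keep only related topics that are publicly published.
    return sorted(related & set(published))
```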

Marketing

It doesn’t matter how good your product or service is, if nobody knows about its (relative) merits you are as good as dead in the water. In that respect, in parallel to the on-going development of PerfectLearn, I am actively pursuing the marketing-related activities outlined below.

  • Listening to (and acting upon) user feedback. Several people are providing me with on-going constructive criticism with regards to the application’s feature set, user experience, and in general, its value proposition. Although I have a strong product vision for PerfectLearn, it only makes sense to listen to people who have valuable insights into your product’s market and use cases so as to incorporate valid suggestions into the application.
  • Writing product tutorials (including screencasts). I have a list of tutorials and accompanying screencasts to educate (excuse the pun) prospective users with regards to PerfectLearn’s feature set:
    • Web queries overview
    • How to create a topic
    • How to create a simple association
    • How to create a non-trivial association
    • How to add a member to an association
    • How to add a topic reference to a member of an association
    • How to customise the semantic web queries (for semantically related articles, videos, images, and news stories)
    • Language switching and its consequences
    • How to manage your online learning portfolio
  • Engaging with influencers within the EdTech and personal learning environment space.

Legal mumbo jumbo

  • Terms & conditions. Self-explanatory.
  • Privacy policy. Self-explanatory.

For the moment, the above list outlines what is still pending with regards to PerfectLearn’s implementation before I can release a beta version of the application. If you have any suggestions with regards to what you have seen up until now, I will be grateful for your feedback.

Stay tuned for more tutorials and screencasts, and sign up for the newsletter to get the latest updates. Subscribe to the PerfectLearn newsletter.

Overview of PerfectLearn

The screencast below will walk you through signing in to PerfectLearn, creating two topics, and viewing the web queries for related study materials of the two newly created topics.

PerfectLearn’s web queries overview from Brett Kromkamp on Vimeo.

To sign in, you use the username and password that you provided during the sign-up process. Once you have signed in, you will see PerfectLearn’s intro page. To access your topics, just click on the big green Start learning button which will take you to the Front page topic, the starting point from which you will be able to access all of your documented knowledge.

To create a topic, click on the Topic dropdown and select Create topic from the menu and type "Pablo Picasso" in the Name field. PerfectLearn will automatically generate the topic identifier, in this case pablo-picasso, based on the name that you provided for the topic (the significance of the topic identifier will be explained in a future tutorial). Once you have provided the topic’s name (with the accompanying generated topic identifier), click on the green Submit button to create the topic.
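
One plausible way to derive an identifier from a name is sketched below in Python. This is an assumption about the behaviour, based only on the pablo-picasso example; the function name generate_identifier is hypothetical and PerfectLearn's actual rules (for accented characters, say) may differ.

```python
import re


def generate_identifier(name):
    """Lower-case the name and join its runs of letters and digits with hyphens.

    Note: this simple version drops any character outside a-z and 0-9,
    including accented letters.
    """
    return "-".join(re.findall(r"[a-z0-9]+", name.lower()))


print(generate_identifier("Pablo Picasso"))  # prints "pablo-picasso"
```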

Now, let’s create another topic. The difference this time around is that we will edit the generated topic identifier and provide a shorter one. Just as before, click on the Topic dropdown and choose Create topic from the menu. In the Name field, type "North Atlantic Treaty Organization", then click on the Edit button located next to the Identifier field and change the identifier from "north-atlantic-treaty-organization" to "nato".

Before moving on, a couple of observations about what we have seen up until now are necessary. First of all, you saw that each topic has an accompanying topic identifier. Topic identifiers are of special importance when establishing relationships between two or more topics (that is, creating associations), as you need to provide the necessary topic identifiers when creating the association. Don’t worry about having to remember the exact topic identifiers: PerfectLearn has autocomplete functionality in place for all of the fields that require a topic identifier, making it very easy to select the appropriate topic.

Secondly, it is important to understand that a newly created topic or a topic that you navigate to becomes the current topic. With PerfectLearn you are always working within the context of the current topic. You can edit and view the current topic’s text, view and download the current topic’s images and files, or view the current topic’s associations and web queries. Finally, and this is an important point, you create associations between the current topic (also known as the source topic) and another topic (the so-called destination topic). Creating simple and non-trivial associations will be the subject of another tutorial.

After creating the topic, let’s take a look at the various web queries that PerfectLearn can perform for a topic. To see the results of the web queries, click on the Web queries dropdown and select the View menu item. PerfectLearn will retrieve all of the related articles from Freebase and display them in a list. Click on the Preview link of the corresponding article to see a preview of the article. If you want to see the full Wikipedia article just click on the See Wikipedia article link in the preview window.

Next on our list of web queries is the video query. Click on the Video tab to display the list of related YouTube videos for the topic. Clicking on the View link (or the video’s thumbnail) will open a video player in a separate window for you to view the video.

After the video query, we will take a look at the images query. Click the Images tab to see the most relevant Flickr images for the topic. Clicking on a thumbnail will load the corresponding image in its original size in a separate window. You can view the previous and next image by clicking on the left and right arrows, respectively. You can also view a slideshow of all of the images by clicking on the Play button at the top of the screen when viewing an image.

Finally, PerfectLearn also provides a web query for news stories. Click on the News tab to see a chronologically sorted list of news stories related to the current topic. Clicking on a link will open the corresponding news story in a new browser tab.

This concludes the first brief overview of PerfectLearn. Stay tuned for more tutorials and screencasts, and sign up for the newsletter to get the latest updates. Subscribe to the PerfectLearn newsletter.