PerfectLearn beta period uptime report

PerfectLearn’s overall uptime for the month of March was 99.51% (Pingdom’s summary is provided below). March also coincided with PerfectLearn’s beta period, so, on the whole, I’m pretty satisfied with PerfectLearn’s stability during its beta phase.

PerfectLearn Pingdom Report March 2015

All of the downtimes except the one on March 09 were scheduled downtimes for deployment purposes (the roll-out of bug fixes and functionality enhancements). The "3h 20m" downtime on March 09, however, was due to my VPS hosting provider scheduling a Xen security-related update with a mandatory reboot.
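
To put the percentage into some perspective, here is a quick back-of-the-envelope calculation (mine, not Pingdom’s) that converts the uptime figure into total downtime minutes for a 31-day month:

// Rough downtime estimate for March (31 days) given 99.51% uptime.
def uptime = 0.9951d
def minutesInMarch = 31 * 24 * 60 // 44,640 minutes
def downtimeMinutes = Math.round(minutesInMarch * (1 - uptime))
println "Approximate total downtime: ${downtimeMinutes} minutes " +
        "(~${downtimeMinutes.intdiv(60)}h ${downtimeMinutes % 60}m)"
// => 219 minutes, roughly 3h 39m, which lines up with the 3h 20m
//    unscheduled reboot plus the shorter scheduled deployment windows.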

The apparent stability of PerfectLearn makes me confident that version 1.0 of PerfectLearn is ready to be formally released.

Stay tuned for updates. Subscribe to the PerfectLearn newsletter.

Thinking about business models

PerfectLearn’s beta phase is coming to an end. It has been thoroughly tested (resulting in a substantial amount of bug fixes and several user-facing improvements) and I have a clear understanding of the path ahead of me in terms of the application’s future development. As planned, in April I will make PerfectLearn available to anyone who wants to use it.

Creative Space

Product development

Arriving at this stage of PerfectLearn’s development also means that I really need to start thinking about where I want to go from here.

Product development principles

  • Don’t build technology for the sake of technology
  • Solve for patterns, not for instances
  • Always ask, “what is the goal?” and “what are you optimizing for?”
  • Look for leverage, but always invest in your core
  • Know your core

The above principles should be dear to any developer’s heart. Always, at least, try to keep them in the back of your mind when embarking on a new software-related project. With regards to PerfectLearn, although it is based on the topic maps paradigm1 —one of my favorite metadata models— I definitely did not build it just for the sake of using topic maps. A topic maps-based approach lends itself very well to providing structure to what is, for all intents and purposes, unstructured data. In PerfectLearn’s case, the unstructured information is the personal knowledge of an individual.

In addition, topic maps is an enabling technology that allows for the relatively straightforward development of many downstream products. Moreover, using topic maps as the core of your software allows you to solve a wide variety of (data-modeling) problems in a standardized way. A side effect of having to develop a topic map engine for PerfectLearn is that once the engine has been built, it becomes possible to solve for data-modeling patterns and not just for specific instances of those patterns.

Finally, with regards to the above principles, from my point of view as a software architect and developer, my “core” is semantic technologies in general and the topic maps paradigm in particular (at the risk of becoming a one-trick pony).

So, what is this all leading up to? Well, as I mentioned above, topic maps is an enabling technology. Topic maps really can be used as the foundation of many software solutions. In that respect, I have been exploring different areas where a topic map-based approach could contribute substantially to a product’s value proposition2 and I think I have identified several interesting options3. So, this is the point where I need to step back and truly consider how I want to proceed.

Opportunity cost

A fundamental and very powerful concept in microeconomics is opportunity cost. Opportunity cost basically equates to the value of the best alternative that you did not pursue when faced with several mutually exclusive alternatives given limited resources. In my case, the limited resource is time. That is, I can only focus (properly) on one thing at a time. And by doing so, I forgo the other options and will have to pay the accompanying opportunity cost. This means that I need to have some kind of method to determine which alternative really is the best one. So, the method I have devised to provide me with an initial approximation of business model attractiveness revolves around two main variables: potential payoff (a logarithmic scale from 1 to 5, from lowest potential payoff to highest potential payoff, respectively) and probability of success (between 0 and 1, impossibility and certainty, respectively).

Evaluation of business models

When applying the above-mentioned method to some of the (business model) opportunities I have identified for both PerfectLearn (the web application) and the set of technologies I have developed as part of the PerfectLearn offering, we get the following:

  1. The PerfectLearn web application with a freemium-based SaaS business model: taking into account the feedback from the individual beta users in combination with several companies expressing a desire to explore using PerfectLearn in different scenarios leads me to believe that PerfectLearn can be productized. The real challenge is the execution of an effective SaaS-oriented business model and the accompanying issue of customer churn. In that respect, I score this business model as follows: 0.30/3 probability of success and potential payoff, respectively. Like other SaaS offerings, this specific business model follows the High Consumer Intent for Commercial Transaction/Low Traffic pattern.
  2. The PerfectLearn online academy/digital content monetization business model: After having researched course design in combination with my own experiments with online teaching, I believe that PerfectLearn’s versatility makes it a viable tool for course instructors to use for planning, building, organizing, and presenting (online) courses in conjunction with the monetization of the course’s digital content in the form of curated topic maps, screencasts, and eBooks. The business model for the monetization of high-quality, education-related digital content is one that I believe I understand well enough to have a real chance of success. In that respect, I score this business model as follows: 0.60/2 probability of success and potential payoff, respectively. This business model follows the same High Consumer Intent for Commercial Transaction/Low Traffic pattern as the SaaS model outlined above. Finally, an additional advantage of pursuing this business model is that it also provides me with a viable user on-boarding process for the PerfectLearn SaaS offering. That is, some of the pupils would convert to PerfectLearn users because of their exposure to PerfectLearn as a central part of the course offerings, which implies that I should probably include the PerfectLearn SaaS offering as part of this business model as a value-added service.
  3. The PerfectLearn on-device topic map engine business model: As part of building PerfectLearn, I have also developed a native on-device topic map engine for Android (the iOS port is in a very early stage of development). Having an on-device topic map engine opens up numerous possibilities with regards to building compelling and feature-rich mobile apps that can function equally well in both online and offline modes without sacrificing any advanced capabilities4. In many developing countries, intermittent internet access allows people to download your app but not to actually interact with the app’s content and functionality in a consistent manner, resulting in a degraded user experience. Furthermore, when (open) WiFi access is not available, expensive mobile data plans make offline content delivery platforms attractive. Taking the above into account, I have done extensive research into the mobile space and have concluded that there are several app categories that are perfect targets for topic map-based solutions. In comparison to the other alternatives that I’ve outlined above, this is the outlier business model. A freak. Also, let’s be honest: any business model targeting the mobile space has to provide a very compelling value proposition in combination with an innovative bring-to-market approach. And, even so, the chance of success is small, to say the least. In that respect, I score this business model as follows: 0.15/5 probability of success and potential payoff, respectively. The reason why I have scored this business model with a relatively high probability of success (0.15) is that I believe that I have genuinely identified an app category that is ripe for (dare I say it, the most over-used word in startup marketing) disruption. Obviously, the potential payoff in the mobile app space is huge if you get it right. That’s why this business model, in comparison to the other two models, has been given the highest potential payoff score. Finally, this business model follows the High Consumer Intent for Commercial Transaction/High Traffic pattern.

Business Model Success/Payoff Matrix
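
As a quick illustration of how the two variables can be rolled up into a single figure, the following sketch simply multiplies probability of success by potential payoff to get an expected-value-style score for each model. The multiplication is my own shorthand (the matrix above deliberately keeps the two variables separate, and the payoff scale is logarithmic), so treat the resulting numbers as a rough ordering rather than anything precise.

// Rough expected-value ordering of the three business models, using the
// scores given above: probability of success (0..1) and potential payoff (1..5).
def businessModels = [
    [name: 'Freemium SaaS web application',         probability: 0.30, payoff: 3],
    [name: 'Online academy / content monetization', probability: 0.60, payoff: 2],
    [name: 'On-device topic map engine (mobile)',   probability: 0.15, payoff: 5]
]

businessModels
    .collect { it + [score: it.probability * it.payoff] }
    .sort { -it.score }
    .each { println "${it.name}: ${it.score}" }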

Final words

It’s possible to develop a myriad of different business models for each of your software products and the above method is only one of many frameworks that you could use when evaluating the different (software) business models.

Software business models allow you to think about your software on a different level. But ultimately, bringing software to the market in a successful manner depends on numerous factors. You need to ensure that you consider all of these aspects in a structured way and then choose the business model that best addresses the challenges that you will face when developing and marketing your software.

Stay tuned for updates. Subscribe to the PerfectLearn newsletter.


  1. My other favorite metadata model is The Universal Design Pattern.
  2. Using topic maps as the underlying data model for a product should be an implementation detail and completely transparent to the user. From a user’s point of view, the superior contextual experience that topic maps make possible should just work as if it were magic.
  3. No, I don’t think that this is an example of "a solution trying to find a problem". A topic map-based architecture can substantially improve the experience of many software products.
  4. Revenue forecasts for Augmented Reality (AR) could hit $120 billion by 2020 (VR forecasts at $30 billion). Imagine a hybrid cloud/on-device topic map engine as the Point-of-Interest (POI) database for AR mobile apps. I would be willing to bet the house on this one.

PerfectLearn development update March 2015

In the last two weeks only one big(ish) change has been implemented and deployed. All the other changes to PerfectLearn have been minor user interface-related tweaks and fixes. The big change was to PerfectLearn’s editor component. Previously, PerfectLearn was using the Bootstrap-wysihtml5 editor. Bootstrap-wysihtml5 is a reasonable editor. Nonetheless, in retrospect, it has proven not to be up to the task of serious text editing. In many respects, it is a decidedly lightweight editor. So, after discussing this issue with some of the more active beta users, I decided to swap it for a Markdown-based editor.

PerfectLearn Markdown editor

The new editor has some neat functionality, including the ability to preview the resulting HTML before saving the topic and a full-screen option (which I find particularly useful).

PerfectLearn full-screen Markdown editor

Based on the feedback from the users and my own impression when using PerfectLearn, I’m convinced that replacing the editor, even at this late stage of the beta phase, was the right thing to do.

Stay tuned for updates. Subscribe to the PerfectLearn newsletter.

Teaching with PerfectLearn

PerfectLearn’s beta testing phase is coming to an end. That is, in April I will officially launch PerfectLearn. What this means is that I can start using PerfectLearn in another online project of mine: YouProgramming.

Creative Space

YouProgramming is a project that I started well over a year ago (and then put on hold while I was building PerfectLearn). It’s both a website and an accompanying YouTube channel aimed at teaching people how to program. The various programming courses (for both the Java and Python programming languages) consist of a series of screencasts and supporting materials in the form of PDF files.

PerfectLearn is unique in that it can be used by teachers and students to help organize their teaching materials and personal knowledge, respectively.

PerfectLearn is unique in that it can be used by teachers and students to help organize their teaching materials and personal knowledge, respectively. What’s more, PerfectLearn’s versatility makes it an ideal tool for course instructors to use for planning, building, and organizing (online) courses. One of the features that I am already adding to PerfectLearn to make it more suitable as a teaching tool is the ability to (partially) automate the generation of the accompanying learning materials for each part of the course, as an ebook, based on a selection of topics stored in PerfectLearn.

What this all means is that over the course of the next couple of weeks I will start building courses using PerfectLearn followed by publishing those courses on both YouProgramming’s and PerfectLearn’s YouTube channels. I will keep you all posted as to my progress.

Stay tuned for updates. Subscribe to the PerfectLearn newsletter.

PerfectLearn Version 1.0 Beta User Feedback

Two weeks ago, I published the beta version of PerfectLearn and asked a number of people who had expressed a desire to try PerfectLearn to start testing it. These two weeks have been quite a ride. But, first things first. The quality of the feedback from the beta users has been nothing short of fantastic. Thank you.

The quality of the feedback from the beta users has been nothing short of fantastic.

Several bugs have been found and fixed. One of the bugs, specifically, turned out to be quite a nasty one. Luckily, switching to the PostgreSQL database has fixed the issue completely.

Using PerfectLearn

In addition to fixing the bugs, I have also decided to implement some of the suggestions based on feedback from the last two weeks. Two features, specifically, have already been implemented: the (inline) quick help option and an interactive component for displaying related topics. A short screencast demonstrating the two features is available on YouTube, here.

Both features were straightforward to implement and provide value to the user. On the one hand, the quick help option helps users to familiarize themselves with some of the application’s more prominent user interface elements. On the other hand, the second feature, the related topics component, improves the user experience by showing only the necessary navigational information —the topic context— while at the same time making it possible to interact with said navigational information. In this respect, it’s important to note that the previous version of PerfectLearn displayed related topics, as well. However, it did so in a completely static way, not allowing the user to filter the related topics by association type and member role.

Finally, I hope (and expect) to finish PerfectLearn’s beta testing within the next fortnight. I feel privileged to have the users I have. I only hope that PerfectLearn lives up to your expectations.

Stay tuned for updates. Subscribe to the PerfectLearn newsletter.

An effective teaching tool

Recently, I published an article in which I outlined the process of importing The CIA World Factbook into PerfectLearn. Together with the article I also uploaded a screencast to YouTube showing the result of the import. What I didn’t expect was the nature of the feedback that the article, and more specifically the screencast, would generate.

PerfectLearn: an effective teaching tool

Several people have contacted me asking if PerfectLearn, together with a pre-loaded version of the CIA World Factbook data, could be made available for teaching purposes. That is, people saw that a tool like PerfectLearn in combination with a compelling dataset can be used as an effective teaching tool. In some respects, this surprised me.

Since I started building PerfectLearn I have consistently focused on a very specific type of user: the individual learner.

Since I started building PerfectLearn I have consistently focused on a very specific type of user: the individual learner. I always imagined PerfectLearn being used the way I use it; that is, as a tool to help an individual manage their personal knowledge. I definitely did not picture PerfectLearn being used within a group setting as a tool for a teacher to complement and enhance the teaching process.

At this stage in PerfectLearn’s development, I still think that it is of vital importance to maintain the focus on the individual learner. Nevertheless, this insight into using PerfectLearn as a teaching tool has provided me with several ideas on how to adapt PerfectLearn to make it a useful companion for teachers to help them enable even better learning experiences. In that respect, your feedback in the form of suggestions, comments, and ideas is more than welcome.

Stay tuned for updates. Subscribe to the PerfectLearn newsletter.

How I used the CIA World Factbook to test my product

In preparation for the release of the first version of PerfectLearn, testing is the order of the day. To make the testing process both more realistic and more enjoyable I decided to load an external dataset into PerfectLearn to see how it handled a non-trivial topic map.


Screencast showing the CIA World Factbook data after it has been imported into PerfectLearn.

After searching online for a couple of hours I finally settled on the CIA World Factbook which in its own words “provides information on the history, people, government, economy, geography, communications, transportation, military, and transnational issues for 267 world entities.” All in all, the World Factbook is an interesting dataset that the CIA has made available for personal use.

The first thing to do when confronted with a task like this is to try to get a basic understanding of the nature of the data.

The first thing to do when confronted with a task like this is to try to get a basic understanding of the nature of the data. After examining the contents of the decompressed factbook.zip file I concluded that the following files and directories were sufficient to extract the necessary information to build the initial topic map ontology with some supporting images for each country’s topic:

  • geos
    • *.html: HTML documents for the 267 world entities.
    • print/country/*.pdf: the corresponding PDF documents for the 267 world entities.
  • graphics
    • flags/large/*.gif: country flags in GIF format.
    • maps/newmaps/*.gif: country maps in GIF format.
  • wfbExt
    • sourceXML.xml: XML file mapping country names, codes, and the corresponding regions.

CIA World Factbook Directory

There really is much more data available in the World Factbook than what I am alluding to. For example, in the fields and rankorder directories there are all kinds of data related to country comparisons (within several categories) and the appendix directory contains information about international organizations and groups, international environmental agreements, and so forth. Furthermore, there are both physical and political maps and population pyramids (in BMP format!) for all of the countries and territories. That is, the World Factbook is comprehensive, to say the least.

With an initial understanding of the data, the next step is to extract the information that is relevant for the current purpose. The HTML files in the geos directory provide the majority of the actual content for the countries, territories, and regions. In addition, the wfbExt/sourceXML.xml file (an excerpt of which is provided below) provides a convenient mapping between the countries and accompanying regions. That is, each country record in the sourceXML.xml file includes “name”, “fips”, and “Region” attributes which effectively link countries with regions while also providing the country code (the fips field) for the individual countries (and territories). The sourceXML.xml file will be crucial in the next phase when we are actually importing data into the topic map. For now, however, we need to focus on extracting the text for each country’s topic.

sourceXML.xml file excerpt

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<country>
	<country name="Afghanistan" fips="AF" Region="South Asia" />
	<country name="Akrotiri" fips="AX" Region="Europe" />
	<country name="Albania" fips="AL" Region="Europe" />
	<country name="Algeria" fips="AG" Region="Africa" />
	<country name="American Samoa" fips="AQ" Region="Oceania" />
	<country name="Andorra" fips="AN" Region="Europe" />
	<country name="Angola" fips="AO" Region="Africa" />
	<country name="Anguilla" fips="AV" Region="Central America" />
    ...
</country>

To painlessly extract data from HTML I normally resort to Apache Tika. Apache Tika is a Java library that makes it easy to extract metadata and text from numerous different file types, including (but not limited to) PDFs, Word files, Excel files, PowerPoint files, and, in this case, HTML files.

All in all, only two (Groovy) scripts are required to extract the text from the HTML files and import the data into PerfectLearn while at the same time creating the necessary relationships between the topics. What the first script, Extract.groovy (provided below), does is relatively straightforward. First of all, it imports the necessary Apache Tika classes (lines 7-11) and defines the source and target paths for the directory with the original HTML files and the directory to write the text files with the extracted text (lines 14-19), followed by creating the target directory (line 25). Next, the extraction of text from the HTML files starts by iterating over all of the HTML files (in the source directory) and calling the extractContent function to actually extract the textual content from each of the HTML files, which is subsequently written to a file in the processed directory (lines 27-39). The extractContent function is the most complex code in this script, but all it does is ask Tika to return the content of the document’s body as a plain-text string by removing all the HTML-related markup (lines 45-65), after which the extracted text is passed to the sanitize function (lines 71-79) to remove superfluous text and to inject some markup to ensure better legibility of the text when it’s finally rendered in PerfectLearn. As you can see, Tika is doing the vast majority of the heavy lifting in this script.

Extract.groovy

/*
Extract country text script (from accompanying HTML files)
By Brett Alistair Kromkamp
January 09, 2015
*/

import org.apache.tika.Tika
import org.apache.tika.metadata.Metadata
import org.apache.tika.parser.html.HtmlParser
import org.apache.tika.parser.ParseContext
import org.apache.tika.sax.BodyContentHandler

// ***** Constants *****
final def ORIGINAL_PATH = '/home/brettk/Source/groovy/perfectlearn-miscellaneous/cia-factbook/data/original/geos'
final def PROCESSED_PATH = '/home/brettk/Source/groovy/perfectlearn-miscellaneous/cia-factbook/data/processed/geos'

// ***** Setup *****
def originalDirectory = new File(ORIGINAL_PATH)
def processedDirectory = new File(PROCESSED_PATH)

// ***** Logic *****
println 'Starting extraction process.'

// Create 'processed' directory.
processedDirectory.mkdirs() // Non-destructive.

originalDirectory.eachFile { file ->
    if (file.isFile() && file.name.endsWith('.html')) {
        def textFileName = generateTextFileName(file.name.toString())

        // Create file with extracted text.
        def textFile = new File("$PROCESSED_PATH/$textFileName")
        textFile.withWriter { out ->
            def textContent = extractContent(file.text)
            println textFileName
            out.writeLine(textContent)
        }
    }
}

println 'Done!'

// ***** Helper functions *****

String extractContent(String content) {
    BodyContentHandler handler = new BodyContentHandler()
    Metadata metadata = new Metadata()
    InputStream stream

    def result = ''
    try {
        if (content != null) {
            stream = new ByteArrayInputStream(content.getBytes())
            new HtmlParser().parse(
                stream, 
                handler, 
                metadata, 
                new ParseContext())
            result = sanitize(handler.toString()).trim()
        } 
        return result
    } finally {
        stream?.close()
    }
}

String generateTextFileName(String htmlFileName) {
    return htmlFileName.replaceAll(~/\.html/, '') + '.txt'
}

String sanitize(String content) {
    return content
        .replaceAll(~/(?m)^\s+/, '')
        .replaceAll(~/(?s)^Javascript.*Introduction ::/, 'Introduction ::')
        .replaceAll(~/(?s)EXPAND ALL.*/, '')
        .replaceAll(~/(?m)^([A-Z].*\s+)::.*/, '<h2>$1</h2>')
        .replaceAll(~/(?m)^([a-z])([a-z|\s]*):/, '<strong>$1$2</strong>: ')
        .replaceAll(~/(?m)^([A-Z])([a-z|\s|-]*):/, '<h3>$1$2</h3>')
}

The next script, Import.groovy (provided below), although longer than the previous one, is relatively straightforward, as well. The important thing to realize about this script is that its main function is to iterate over the previously mentioned sourceXML.xml file to create and store the countries, territories, and regions (as topics) in the topic map. First of all, the script imports the necessary Java libraries, including the PerfectLearn topic map engine (lines 8-17), followed by setting up the necessary constants for paths, database-related parameters, and other miscellaneous values (lines 21-32). The next thing it does is instantiate the PerfectLearn topic map engine (line 38) and create some required topics for the World Factbook topic map ontology (lines 42-56). Once the necessary topics have been created, the sourceXML.xml file is loaded and the country/territory/region records are read into a list (lines 62-64) for subsequent iteration (line 69). On each iteration the following actions are performed:

  • The required region identifier, region name, country identifier, country name, and country code are extracted for subsequent use (lines 72-77).
  • The textual content for each country, territory, or region is retrieved from the appropriate text file that was generated by the Extract.groovy script (lines 81-86).
  • The background information, excerpt, and timeline year are retrieved for each country, territory, or region to create the necessary metadata for subsequent display in the timeline component (lines 90-105, and lines 256-272, 274-283, 285-290, for the getBackgroundExcerpt, getBackground, and getTimelineYear functions, respectively).
  • The country or territory topic is created and stored (lines 109-114).
  • The country or territory text occurrence is created and stored (lines 118-126).
  • The region topic is created and stored (lines 130-137).
  • The association (that is, relationship) between a country or territory and its concomitant region is stored (line 141).
  • Coordinates are extracted from the country’s textual content and, if the second set of coordinates is present (for the capital city), the metadatum with the coordinates is created and stored for subsequent visualization in the map component. The convertToDdCoordinates function is called with the extracted coordinates to convert from a degrees-minutes-seconds format to a decimal degrees format, which is the required format for Google Maps (lines 145-149, and lines 241-254 for the convertToDdCoordinates function).
  • A link (occurrence) is added for each country, territory, or region pointing back to the appropriate page in the CIA World Factbook website (lines 153-161).
  • The flag (occurrence) is added for each country (lines 165-179, and lines 292-301 for the copyFile function).
  • The map (occurrence) is added for each country or territory (lines 185-195, and lines 292-301 for the copyFile function).

Next, the associations to establish the appropriate relationships between the regions themselves and between the regions and the "world" (topic) are created and stored in the topic map (lines 201-219). Finally, the textual content for the world topic is retrieved (lines 223-226) and the accompanying occurrence is created and saved (lines 228-236).

Import.groovy

/*
Import CIA World Factbook into PerfectLearn Topic Map Engine
By Brett Alistair Kromkamp
January 15, 2015
*/

// Import necessary Java libraries including the PerfectLearn topic map engine.
import com.polishedcode.crystalmind.base.Utils
import com.polishedcode.crystalmind.base.Language;
import com.polishedcode.crystalmind.map.store.TopicStore;
import com.polishedcode.crystalmind.map.store.TopicStoreException;
import com.polishedcode.crystalmind.map.model.*

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// ***** Constants *****
// Setup necessary paths, database-related parameters, and other miscellaneous values.
final def COUNTRIES_PATH = '/home/brettk/Source/groovy/perfectlearn-miscellaneous/cia-factbook/data/original/wfbExt/sourceXML.xml'
final def MAPS_PATH = '/home/brettk/Source/groovy/perfectlearn-miscellaneous/cia-factbook/data/original/graphics/maps/newmaps'
final def FLAGS_PATH = '/home/brettk/Source/groovy/perfectlearn-miscellaneous/cia-factbook/data/original/graphics/flags/large'
final def PROCESSED_PATH = '/home/brettk/Source/groovy/perfectlearn-miscellaneous/cia-factbook/data/processed/geos'

final def DATABASE = 'pldb_1'
final def SHARD_INFO = "localhost;3306;${DATABASE}"
final def USERNAME = '********'
final def PASSWORD = '********'
final long TOPIC_MAP_IDENTIFIER = 64L
final def COUNTRIES_TOTAL = 268
final def UNIVERSAL_SCOPE = '*'

// ***** Logic *****
println 'Starting importing process.'

// Instantiate the PerfectLearn topic store.
TopicStore topicStore = new TopicStore(USERNAME, PASSWORD)

// Bootstrap required topics.
println 'Bootstrapping...'
def bootstrapTopics = [
	new Entity(identifier: 'country', name: 'Country', instanceOf: 'topic'),
	new Entity(identifier: 'region', name: 'Region', instanceOf: 'topic'),
	new Entity(identifier: 'world', name: 'The World', instanceOf: 'topic'),
	new Entity(identifier: 'part-of', name: 'Part Of', instanceOf: 'topic')
]

bootstrapTopics.each { bootstrapTopic ->
	Topic topic = new Topic(
		bootstrapTopic.identifier,
		bootstrapTopic.instanceOf,
		bootstrapTopic.name, 
		Language.EN)
	topicStore.putTopic(SHARD_INFO, TOPIC_MAP_IDENTIFIER, topic, Language.EN)
}

/*
Iterate over country records (in sourceXML.xml) by extracting the necessary
attributes to create countries, territories, and regions.
*/
def countriesContent = new File(COUNTRIES_PATH).text
def countriesXml = new XmlSlurper().parseText(countriesContent)
def countries = countriesXml.country

assert COUNTRIES_TOTAL == countries.size()

println 'Iterating over countries...'
for (country in countries) {
	// For each country/territory/region extract the region identifier, 
	// region name, country identifier, country name, and country code. 
	def regionIdentifier = Utils.slugify(country.@Region.text())
	if (regionIdentifier) {	
		def regionName = country.@Region.text()
		def countryIdentifier = Utils.slugify(country.@name.text())
		def countryName = country.@name.text() 
		def countryCode = country.@fips.text().toLowerCase()

		// Get topic's text.
		println "Getting topic's text..."
		def topicContentPath = "$PROCESSED_PATH/${countryCode}.txt"
		def topicContentFile = new File(topicContentPath)
		def topicContent = ''
		if (topicContentFile.exists()) {
			topicContent = topicContentFile.text
		}

		// Extract the country's background excerpt.
		println "Extracting the country's background excerpt..."
		if (topicContent) {
			def excerpt = getBackgroundExcerpt(topicContent)
			def background = getBackground(topicContent)

			// Add the appropriate timeline related meta data.
			println 'Adding the timeline metadata...'
			if (excerpt && background) {
				def timelineYear = getTimelineYear(background)
				def timelineMedia = "<blockquote>${excerpt.find(~/(?s)^\S*^(.*?)[.?!]\s/).trim()}</blockquote>".toString()
				if (timelineYear && timelineMedia && excerpt) {
					topicStore.createMetadatum(SHARD_INFO, TOPIC_MAP_IDENTIFIER, 'timeline-event-startdate', timelineYear, countryIdentifier, Language.EN, '', DataType.STRING, UNIVERSAL_SCOPE)
					topicStore.createMetadatum(SHARD_INFO, TOPIC_MAP_IDENTIFIER, 'timeline-media', timelineMedia, countryIdentifier, Language.EN, '', DataType.STRING, UNIVERSAL_SCOPE)
					topicStore.createMetadatum(SHARD_INFO, TOPIC_MAP_IDENTIFIER, 'timeline-text', excerpt, countryIdentifier, Language.EN, '', DataType.STRING, UNIVERSAL_SCOPE)
				}
			}
		}

		// Create and store the country or territory topic.
		println 'Creating and storing the country topic...'
		Topic countryTopic = new Topic(
			countryIdentifier,
			'country',
			countryName, 
			Language.EN)
		topicStore.putTopic(SHARD_INFO, TOPIC_MAP_IDENTIFIER, countryTopic, Language.EN)

		// Create and store the topic's text occurrence.
		println "Creating and storing the topic's text..."
		Occurrence occurrence = new Occurrence(countryIdentifier)
		occurrence.with {
			instanceOf = 'text'
			scope = UNIVERSAL_SCOPE
			language = Language.EN
			resourceData = topicContent.getBytes()	
		}
		topicStore.putOccurrence(SHARD_INFO, TOPIC_MAP_IDENTIFIER, occurrence)
		topicStore.createMetadatum(SHARD_INFO, TOPIC_MAP_IDENTIFIER, 'label', countryName, occurrence.identifier, Language.EN, '', DataType.STRING, UNIVERSAL_SCOPE)

		// Create and store the region topic.
		println 'Creating and storing the region topic...'
		if (!topicStore.topicExists(SHARD_INFO, TOPIC_MAP_IDENTIFIER, regionIdentifier)) {
			Topic regionTopic = new Topic(
				regionIdentifier,
				'region',
				regionName, 
				Language.EN)
			topicStore.putTopic(SHARD_INFO, TOPIC_MAP_IDENTIFIER, regionTopic, Language.EN)
		}

		// Create associations between countries and regions.
		println 'Creating associations between countries and regions...'
		topicStore.createAssociation(SHARD_INFO, TOPIC_MAP_IDENTIFIER, 'country', countryIdentifier, 'region', regionIdentifier)

		// Create coordinates metadatum for each country's capital.
		println "Creating coordinates for country's capital..."
		def coordinates = topicContent.findAll(~/(?m)(^[-+]?\d{1,2}\s*\d{1,2}\s*[A-Z]),\s*([-+]?\d{1,2}\s*\d{1,3}\s*[A-Z])/)
		if (coordinates[1]) {
			def ddCoordinates = convertToDdCoordinates(coordinates[1])
			topicStore.createMetadatum(SHARD_INFO, TOPIC_MAP_IDENTIFIER, 'map-coordinates', ddCoordinates, countryIdentifier, Language.EN, '', DataType.STRING, UNIVERSAL_SCOPE)
		}

		// Add link occurrence to each topic pointing to the original CIA World Factbook country page. 
		println 'Adding CIA World Factbook country page link...'
		Occurrence linkOccurrence = new Occurrence(countryIdentifier)
		linkOccurrence.with {
			instanceOf = 'url'
			scope = UNIVERSAL_SCOPE
			language = Language.EN
			resourceRef = "https://www.cia.gov/library/publications/the-world-factbook/geos/${countryCode}.html"
		}
		topicStore.putOccurrence(SHARD_INFO, TOPIC_MAP_IDENTIFIER, linkOccurrence)
		topicStore.createMetadatum(SHARD_INFO, TOPIC_MAP_IDENTIFIER, 'label', "$countryName CIA World Factbook Page", linkOccurrence.identifier, Language.EN, '', DataType.STRING, UNIVERSAL_SCOPE)

		// Add flag (occurrence) to each topic and copy image to appropriate (web application resources) directory.
		println 'Adding flag...'
		def imageDirectoryName = "/home/brettk/www/static/$TOPIC_MAP_IDENTIFIER/images/$countryIdentifier"
		def imageDirectory = new File(imageDirectoryName)
		imageDirectory.mkdirs() // Non-destructive.

		def serverImageDirectoryName = "/static/$TOPIC_MAP_IDENTIFIER/images/$countryIdentifier"
		
		Occurrence flagOccurrence = new Occurrence(countryIdentifier)
		flagOccurrence.with {
			instanceOf = 'image'
			scope = UNIVERSAL_SCOPE
			language = Language.EN
			resourceRef = "$serverImageDirectoryName/${flagOccurrence.identifier}.gif"
		}
		topicStore.putOccurrence(SHARD_INFO, TOPIC_MAP_IDENTIFIER, flagOccurrence)
		topicStore.createMetadatum(SHARD_INFO, TOPIC_MAP_IDENTIFIER, 'label', "$countryName (Flag)", flagOccurrence.identifier, Language.EN, '', DataType.STRING, UNIVERSAL_SCOPE)

		copyFile("$FLAGS_PATH/${countryCode}-lgflag.gif", "$imageDirectoryName/${flagOccurrence.identifier}.gif")

		// Add map (occurrence) to each topic.
		println 'Adding map...'
		Occurrence mapOccurrence = new Occurrence(countryIdentifier)
		mapOccurrence.with {
			instanceOf = 'image'
			scope = UNIVERSAL_SCOPE
			language = Language.EN
			resourceRef = "$serverImageDirectoryName/${mapOccurrence.identifier}.gif"
		}
		topicStore.putOccurrence(SHARD_INFO, TOPIC_MAP_IDENTIFIER, mapOccurrence)
		topicStore.createMetadatum(SHARD_INFO, TOPIC_MAP_IDENTIFIER, 'label', "$countryName (Map)", mapOccurrence.identifier, Language.EN, '', DataType.STRING, UNIVERSAL_SCOPE)

		copyFile("$MAPS_PATH/${countryCode}-map.gif", "$imageDirectoryName/${mapOccurrence.identifier}.gif")
	}
}

// Create associations between regions.
println 'Creating associations between regions...'
def regionIdentifiers = [
	'africa',
	'central-america',
	'central-asia',
	'east-asia',
	'europe',
	'middle-east',
	'north-america',
	'oceania',
	'south-america',
	'south-asia'
]
for (outerRegionIdentifier in regionIdentifiers) {
	for (innerRegionIdentifier in regionIdentifiers.findAll { it != outerRegionIdentifier } ) {
		topicStore.createAssociation(SHARD_INFO, TOPIC_MAP_IDENTIFIER, 'region', outerRegionIdentifier, 'region', innerRegionIdentifier)
	}
	// Create associations between the world topic and the regions.
	topicStore.createAssociation(SHARD_INFO, TOPIC_MAP_IDENTIFIER, 'part-of', 'world', 'region', outerRegionIdentifier)
}

// Add the appropriate text occurrence ('xx.txt') to the 'world' topic.
println "Adding text occurrence to the 'World' topic..."
def worldTopicContentFileName = "${PROCESSED_PATH}/xx.txt"

def worldTopicContentFile = new File(worldTopicContentFileName)
def worldTopicContent = worldTopicContentFile.text

Occurrence worldOccurrence = new Occurrence('world')
worldOccurrence.with {
	instanceOf = 'text'
	scope = UNIVERSAL_SCOPE
	language = Language.EN
	resourceData = worldTopicContent.getBytes()
}
topicStore.putOccurrence(SHARD_INFO, TOPIC_MAP_IDENTIFIER, worldOccurrence)
topicStore.createMetadatum(SHARD_INFO, TOPIC_MAP_IDENTIFIER, 'label', 'world', worldOccurrence.identifier, Language.EN, '', DataType.STRING, UNIVERSAL_SCOPE)

println 'Done!'

// ***** Helper methods *****
def convertToDdCoordinates(String dmsCoordinates) { // Format: 17 49 S, 31 02 E
	// http://en.wikipedia.org/wiki/Geographic_coordinate_conversion
	def parts = dmsCoordinates.replace(',', '').split(' ')

	def ddLatitude = parts[0].toInteger() + (parts[1].toInteger() / 60) 
	if (parts[2] == 'S') {
		ddLatitude = 0 - ddLatitude
	}
	def ddLongitude = parts[3].toInteger() + (parts[4].toInteger() / 60)
	if (parts[5] == 'W') {
		ddLongitude = 0 - ddLongitude
	}
	return "($ddLatitude, $ddLongitude)"
}

def getBackgroundExcerpt(String content) {
	def result = content
		.find(~/(?s)<\/h3>.*<h2>Geography/)
		?.replaceAll(~/<\/h3>/, '')
		?.replaceAll(~/<h2>Geography/, '')
	if (result) {
		if (result.size() > 320) {
			result = result[0..320]
		}
		if (result[-1] != '.') {
			result = result << '...'
		}
	} else {
		result = ''
	}
	return result.toString()
}

def getBackground(String content) {
	def result = content
		.find(~/(?s)<\/h3>.*<h2>Geography/)
		?.replaceAll(~/<\/h3>/, '')
		?.replaceAll(~/<h2>Geography/, '')
	if (result == null) {
		result = ''
	}
	return result
}

def getTimelineYear(String content) {
	def bcYears = content.findAll(~/\d{4}\sB.C./).collect { it.replace(' B.C.', '') }
	def adYears = content.findAll(~/\d{4}/)
	def years = adYears - bcYears
	return years[0]
}

def copyFile(String sourcePath, String targetPath) {
	Path source = Paths.get(sourcePath)
	Path destination = Paths.get(targetPath)

	try {
		Files.copy(source, destination);
	} catch (IOException e) {
		e.printStackTrace();
	}
}

// ***** Models *****

class Entity {
	String identifier
	String name
	String instanceOf
}

And that’s it, folks! In a follow-up article I will document how to improve the import process outlined in this article to make much better use of the resources provided by the World Factbook. However, on this first iteration, the current import process provides me with sufficient data to thoroughly test PerfectLearn with a non-trivial topic map.

Stay tuned for updates. Subscribe to the PerfectLearn newsletter.

Multiple projects in PerfectLearn

One of the main reasons for building PerfectLearn is to use it myself. I genuinely find it useful to employ a topic map-based approach to organize my personal knowledge. Having successfully used PerfectLearn’s predecessor, QueSucede.com, as an online personal knowledge base for the past seven years has convinced me of the utility of an application that helps a user to manage their (documented) knowledge and to turn it into a tangible thing of value.

Learning and Creativity

When thinking about things that would make PerfectLearn even more useful, I only have to examine the pain points I experience when using the application. Currently, one of the bigger "problems" I see with PerfectLearn is the issue of one topic map per user. That is, when a user signs up to use PerfectLearn, the application creates a topic map for that user. In other words, each user gets one, and only one, topic map. And that, my friends, is a limitation.

When looking at my own needs, I see that I want to be able to create multiple independent topic maps to manage unrelated projects. For example, if you are a student using PerfectLearn, I can imagine you creating a specific topic map for your thesis and creating other topic maps for, well, other purposes. What this means is that PerfectLearn needs to have the ability for the user to create, select, and manage multiple projects where each project is a self-contained topic map isolated from the user’s other topic maps. In retrospect, I consider this to be an essential feature of PerfectLearn and will start implementing it as soon as PerfectLearn version 1.0 has been released.
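
To make the idea a bit more concrete, here is a minimal sketch of what the project abstraction could look like on top of the existing engine: a user owns several projects, and each project simply wraps its own, fully isolated topic map. The class and field names are hypothetical; the real implementation hinges on the database definition changes mentioned in the update below.

// Hypothetical sketch only: one user, many projects, each project backed by
// its own topic map (referenced by the engine's topic map identifier).
class Project {
    long topicMapIdentifier // each project wraps exactly one topic map
    String name
}

class User {
    String username
    List<Project> projects = []

    Project createProject(String name, long topicMapIdentifier) {
        def project = new Project(name: name, topicMapIdentifier: topicMapIdentifier)
        projects << project
        return project
    }
}

// Usage: a student keeping thesis notes separate from everything else.
def student = new User(username: 'student')
def thesis = student.createProject('Thesis', 65L)
def other = student.createProject('Everything else', 66L)
assert thesis.topicMapIdentifier != other.topicMapIdentifier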

If you have any suggestions with regards to the project feature let me know by submitting the feedback form.

Update (January 11, 2015): After some more consideration, I have decided to implement the project feature before launching version 1.0 of PerfectLearn. The main reason for doing so is that the implementation of this feature involves changing the topic map’s database definition. Making this change after launching PerfectLearn would require a potentially quite tricky migration of user data from the previous database definition to the new one, with the accompanying downtime and risk of data loss. All in all, I don’t expect the implementation of the project feature and subsequent testing to significantly delay the launch of PerfectLearn.

Stay tuned for updates. Subscribe to the PerfectLearn newsletter.

Topic map templates

An application like PerfectLearn can be used in a wide variety of scenarios. Although I normally think of PerfectLearn within the learning space, a friend of mine has been testing PerfectLearn and using the application to help him manage some of his work-related projects. While discussing his experience with PerfectLearn we came up with the idea of, what I am now calling, "topic map templates."

Network Graph

A topic map template, for all intents and purposes, is a predefined set of placeholder topics and the accompanying relationships between those topics. Selecting a topic map template from a list of available templates (from within PerfectLearn) would automatically generate a set of topics and associations, after which the user could carry on fleshing out the topics and the associations as if they had been created manually by the user.

Advantages of using a topic map template include saving time (as you don’t have to manually create the topics and the relationships between the topics) and being able to repeatedly create a standardized topic map structure.

In the case of my friend, he found that he was creating the same kind of topics and relationships for the different projects he was managing, over and over again. For example, he was creating the project topic itself, followed by creating topics for the project’s requirements, stakeholders, project deliverables, time frames, and so forth. And after creating the necessary topics he would then have to create the accompanying associations between the topics. All in all, it was an error-prone and time-consuming process. A process that, if you think about it, is unnecessary. It is something that can be automated away.
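
As a rough illustration, applying such a template could boil down to little more than looping over a predefined list of topics and associations and handing them to the topic map engine, much like the bootstrap step in the Import.groovy script shown earlier. The template data here (a project-management template along the lines of my friend’s use case) is purely illustrative; only the putTopic and createAssociation calls mirror the engine API as used in that script, and the topicStore, SHARD_INFO, and TOPIC_MAP_IDENTIFIER setup is assumed to be in place.

// Illustrative sketch: applying a "project management" topic map template.
// The template itself is just data; putTopic and createAssociation are the
// same engine calls used in Import.groovy (topicStore, SHARD_INFO, and
// TOPIC_MAP_IDENTIFIER are assumed to be set up as in that script).
def template = [
    topics: [
        [identifier: 'project',     name: 'Project',      instanceOf: 'topic'],
        [identifier: 'requirement', name: 'Requirements', instanceOf: 'topic'],
        [identifier: 'stakeholder', name: 'Stakeholders', instanceOf: 'topic'],
        [identifier: 'deliverable', name: 'Deliverables', instanceOf: 'topic'],
        [identifier: 'time-frame',  name: 'Time Frames',  instanceOf: 'topic']
    ],
    associations: [
        [source: 'project', target: 'requirement'],
        [source: 'project', target: 'stakeholder'],
        [source: 'project', target: 'deliverable'],
        [source: 'project', target: 'time-frame']
    ]
]

template.topics.each { t ->
    Topic topic = new Topic(t.identifier, t.instanceOf, t.name, Language.EN)
    topicStore.putTopic(SHARD_INFO, TOPIC_MAP_IDENTIFIER, topic, Language.EN)
}

template.associations.each { a ->
    topicStore.createAssociation(SHARD_INFO, TOPIC_MAP_IDENTIFIER,
        'topic', a.source, 'topic', a.target)
}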

With regards to topic map templates, I envisage the PerfectLearn community submitting templates to a central repository that other PerfectLearn users could browse or search from within the application. When a user finds an appropriate template, they would download it, after which it would be available within PerfectLearn to create the desired topic map structure. To ensure that the user has an accurate overview of what exactly the template provides, on selecting a template, a live visual graph of the template would be displayed together with a description (provided by the user who submitted the template to the repository). Furthermore, the individual templates would be categorized, tagged, and user-rated to ensure that the most useful templates are easily found.

Topic map templates could range from the simple (only a handful of topics and associations) to the complex (dozens, or perhaps even hundreds, of topics and accompanying associations). In that respect, it should be possible to undo or roll back the application of a template to ensure that the user is not saddled with topics and associations that, in retrospect, are not applicable to his or her requirements.

Finally, I expect to start implementing the topic map template feature in version 1.2 of the application. If you have any suggestions with regards to the template feature let me know by submitting the feedback form.

Stay tuned for updates. Subscribe to the PerfectLearn newsletter.

PerfectLearn, the final sprint

Since publishing the PerfectLearn development update on December 06 (2014), the following functionality has been completed:

  • Generate and display a tag cloud based on the user’s tagged topics
  • Edit note
  • Edit URL
  • Edit video link
  • Edit metadatum
  • Numerous minor bug and user-interface fixes

Bokeh Pens by Long Mai (Flickr): http://www.flickr.com/photos/25740835@N08/4377921097/in/photolist-7ERZqe-8BsFs5-9gWAUM-dHW7sD-7X5ABM-b3DidZ-8B616C-awGhso-d45bmf-b8ycpD-dLbKg6-8pBLCf-d2g9Kb-9UE3uM-8JhofC-942FuL-dHW7L8-96dGfb-bBJRGA-aN4z24-dMunkg-dMzVuG-dMun6g-dMunpM-dMumVR-dMzVPw-dMzVGU-dMzVKE-dMumXM-cwTTq1-dKAxSW-e3GcNo-dGSW5V-dMzVEE-9vc6Jr-bjCuBK-9GnevJ-eAE9nj-e34fr4-ebukxw

This means that the topics index and the front-end validation of forms are the only remaining bits of functionality left to implement for version 1.0. As you can see, I’m slightly behind schedule. Nonetheless, I feel that good progress is being made and I also expect to make up some time during the Christmas break.

I also hope to blog on a more regular basis from now until, at least, PerfectLearn has been released.

Thanks for being there for me.

Stay tuned for updates. Subscribe to the PerfectLearn newsletter.