Oftentimes, especially in my consulting days, I found myself driving around some big city’s downtown area, desperately hoping to navigate the unfamiliar one-way streets and get someplace on time. Just following what everyone else was doing, hoping the busier main arteries would take me downtown, was obviously not a good choice; that would have gotten me where they were going and not where I needed to be. The plan I settled on, and that ended up working well, was to arrive the evening before, check into my hotel and take a quick drive to the office where I would be working. This “reconnaissance” approach gave me an opportunity to plot my route in a low-stress environment without all the rush hour traffic. Then I’d stop for a bite to eat, go back to my room and unwind a bit more before turning in for the night.
This process always made for a much better trip. I was able to drive to the customer office with confidence and arrive in a much more relaxed state on the already uncertain first day on the job. Of course, this was all before GPS was commonplace. I’d probably still do it the same way today, though, so as to minimize the unknowns in what can already be a stressful situation.
Getting around on the I-Way, especially getting around safely, can benefit from some of that same initial scouting activity. Let’s take a quick trip across the net to a couple of destinations and I’ll act as your GPS guide for these short excursions.
First, let’s take a little walk around the vehicle, our browser, and have a look at the rudimentary elements of how it works. The browser is the software you use to pull information from the internet. It is what enables us all to visit web pages, see pictures, read text and watch movies. Microsoft and Apple would like you to use the browsers they supply, “Internet Explorer” and “Safari,” but that isn’t a requirement. There are other perfectly good browsers, like Firefox and Opera, that can be downloaded from the internet. Personally I use Firefox1 because it is independently developed and offers protections that more commercially biased software providers aren’t quick to implement, like Local Shared Object (LSO) or “super cookie” protections.
At the top of every browser is the address bar. This is where the address of a web site is typed in order to access it on the Internet. The “address” of a site is also known by its formal name, “Uniform Resource Locator,” or “URL” for short.
The address bar was at one time commonly labeled as the “Address” bar. Today browsers may have “Address” or “URL” as the label, or browsers such as “Firefox” and “Internet Explorer” might not even label it at all, depending on the version you are running. Address bars all serve the same general purpose: they allow the end user to tell the browser what they want to see. The address bar also informs the end user of the exact link address for what they are seeing within the browser, regardless of how the web page arrived there. So as not to make this a huge dissertation on internet browsers, I’ll avoid trying to cover all the other features and functions of browsers, such as the buttons of the tool bar, adding “plugins” (or what plugins are, for that matter), or the more technical inner workings. For now let’s just focus on the address bar and be on our way.
Let’s go shopping. I like to shop at “Territory Ahead.” Assuming I don’t already have the address saved, I’ll type it into my browser’s address bar (TerritoryAhead.com). The browser doesn’t know where to take you based on this address alone. This address format, called a “Domain Name,” was created for humans to read and makes remembering locations easier. Computers work with numbers, so your computer essentially looks in a massive Internet-based phone book to translate the location name into the computer equivalent of a phone number. More specifically, the computer in front of you places a request across the internet to the “Domain Name System” (DNS) to get what is known as an IP address (IP stands for Internet Protocol). An IP address is in the format ‘xxx.xxx.xxx.xxx’, unless the newer but not yet common format, IPv6, is used. For simplicity we’ll stick with the currently commonplace format. The IP address for “Whitehouse.gov,” for example, was ‘184.25.184.110’ at the time of this writing. None of the four numbers making up an IP address ever exceeds 255. Okay, we’ve got the numeric address, let’s go get our web page.
Once the computer that is hosting the web site out on the Internet receives your computer’s request for the web page, it also receives your computer’s IP address. This is so the remote computer knows where to send the requested page. Most sites will also record the IP address for perfectly reasonable statistical analysis, such as determining how long a person browses their site. This is considered the site’s “stickiness.”
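For the curious, the phone book lookup can be sketched in a few lines of Python. This is only an illustration: the function names `looks_like_ipv4` and `look_up` are made up for this example, while `socket.gethostbyname` is the standard library call that asks your computer’s resolver for an address. The answer it gives depends on when and where you run it, so don’t expect to see the same numbers quoted above.

```python
import socket


def looks_like_ipv4(text: str) -> bool:
    """Return True if text is four dot-separated numbers, each from 0 to 255."""
    parts = text.split(".")
    if len(parts) != 4:
        return False
    return all(
        part.isascii() and part.isdigit() and int(part) <= 255
        for part in parts
    )


def look_up(domain_name: str) -> str:
    """Ask the system resolver (the "phone book") for an IPv4 address.

    Needs a network connection; the result varies by time and location.
    """
    return socket.gethostbyname(domain_name)


if __name__ == "__main__":
    print(looks_like_ipv4("184.25.184.110"))  # every number is 255 or less
    print(looks_like_ipv4("184.25.184.256"))  # 256 is too large
```

Calling `look_up("whitehouse.gov")` on a connected machine would return whatever address the phone book holds for that name at that moment.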
Now you are at the site and it has loaded up a web page full of colorful images and descriptive text. There is also another address in the address bar. Instead of the “TerritoryAhead.com” value that was typed in originally, it now reads “www.TerritoryAhead.com.” This won’t always happen and if Territory Ahead changes their web site it might not even happen now at the time you are reading this. I’ll go ahead and explain what is happening in this particular case though.
Domain addresses are read by computers from right to left, so let’s take a look at the address as if we were computers. You’ll notice that both addresses, the one I provided and the one it changed to, have periods in them. These are delimiters used to break the address into levels. In this case “com” is the top level domain. There are a very limited number of top level domains. When the Internet started out there were only 7: gov for government, edu for education, com for commercial, net for network, org for organization, mil for military and arpa, which is an internet infrastructure specific top level domain. The next domain level down in this case is “TerritoryAhead,” which is the specific reference to the site I was looking for when I first entered the information into my browser. Lastly there is another string of characters, “www,” which has appeared. This could effectively stand for “Wild Wild West,” which is how the Internet functions today, like the lawless American West of the 1800s, but it in fact stands for “World Wide Web.” This leftmost string of characters is most commonly a reference to a specific computer within the domain. So what happened in this case is that when I put in the domain name “TerritoryAhead.com,” the traffic cop computer managing the company domain looked at the request and sent my computer the information required to talk directly to the World Wide Web computer for the company. There could very well be an “email.TerritoryAhead.com” computer at the company that handles their email and an “orders.TerritoryAhead.com” computer that handles their orders, and on and on it can go. Computers in business are normally named, in one fashion or another, to designate their function.
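Reading the address the way a computer does, right to left, can be shown with a small Python sketch. The function name `domain_levels` is my own invention for this illustration; it simply splits the name on the periods and reverses the order.

```python
def domain_levels(domain_name: str) -> list[str]:
    """Split a domain name on its periods and list the levels right to left.

    Domain names are not case sensitive, so the name is lowercased first.
    """
    labels = domain_name.lower().split(".")
    return list(reversed(labels))


if __name__ == "__main__":
    # Top level domain first, then the site, then the specific computer.
    for level in domain_levels("www.TerritoryAhead.com"):
        print(level)
```

The first item in the result is the top level domain, the second is the site name and the last is the name of the specific computer within the domain.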
After I click on a specific product, my address bar is populated with a much longer string of characters. Regardless of the length of the line in the address bar, you will see a forward slash immediately following the top level domain. (There is a special case when you might see a colon and a number [:2913 for example] between the top level domain and the slash, but that is just more location information, and I’m sure it is much more than most folks would find interesting for me to explain.) There may even be more than one forward slash. The first forward slash and everything beyond it are just instructions to the receiving computer informing it specifically how to locate the information that was requested. As a bit of trivia, if there is a question mark in the line, the bit just before it is often the name of a program the computer hosting the site will have to run to get the information. The rest will usually be detail about what the user asked for; in this case it would be the fact that I clicked on a smoke colored after hours crew neck sweatshirt.
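Here is a rough Python sketch of how such a long address breaks apart, using the standard `urllib.parse` module. The address in `EXAMPLE_ADDRESS` is entirely made up for illustration (the real site’s product links will look different), and the function name `describe_address` is my own.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical address, invented for this example only.
EXAMPLE_ADDRESS = (
    "http://www.TerritoryAhead.com:2913/catalog/product.jsp"
    "?color=smoke&style=crewneck"
)


def describe_address(address: str) -> dict:
    """Break a web address into the pieces described in the text."""
    parts = urlsplit(address)
    return {
        "protocol": parts.scheme,         # what comes before "://"
        "computer": parts.hostname,       # the domain name, lowercased
        "port": parts.port,               # the number after the colon, if any
        "path": parts.path,               # first slash up to the question mark
        "details": parse_qs(parts.query)  # what the user asked for
    }


if __name__ == "__main__":
    for name, value in describe_address(EXAMPLE_ADDRESS).items():
        print(name, "=", value)
```

In this made-up address the path ends in `product.jsp`, standing in for the program the hosting computer would run, and the details after the question mark carry the color and style that were clicked.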
Without going into painstaking detail, let’s just say I went through the rest of the process to buy the sweatshirt, the order was placed and I have received a receipt in my email. What else happened behind the scenes? A lot. The computer hosting the site for the company stored my purchase data, along with information about everywhere I went on the site, in the company’s data repository and logs. If I made any comments on the site, or entered any other user generated data, that was also stored in the company data repository. Very likely the actual purchase information and personally identifiable data, such as credit card information, was shipped off via a secure connection to a third party financial processing company, along with some bit of information about my account with the company, like my login id. Any items remaining in my shopping cart, a note of when I last visited the site and other relevant info were stored in my browser in what is called a “cookie.”
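A cookie is nothing more exotic than a short list of names and values that the browser hands back to the site on each visit. As a small sketch, Python’s standard `http.cookies` module can read one. The cookie names `cart` and `last_visit` and their values are invented for this example, as is the function name `read_cookies`; a real site chooses its own.

```python
from http.cookies import SimpleCookie


def read_cookies(cookie_text: str) -> dict[str, str]:
    """Turn the cookie text a browser sends into a name-to-value mapping."""
    jar = SimpleCookie()
    jar.load(cookie_text)
    return {name: morsel.value for name, morsel in jar.items()}


if __name__ == "__main__":
    # Hypothetical cookie contents, for illustration only.
    print(read_cookies("cart=sku123; last_visit=2011-10-03"))
```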
The entire journey to make this purchase was fraught with hazards. Much like driving to the store, many things can go wrong and there are protections against most of them. Just as in the case of taking a trip to the store, there are potential problems with the car itself, and to compensate there are lights and sensors all over the car to warn the driver if something is wrong. This is roughly equivalent to the browser being our vehicle on the internet, with protections like popup blockers, cookie cleansing and other privacy tools. The road itself is plagued with bandits and thieves just as there are, to a much lesser degree, car-jackers on the physical roadways. The transfer of financial and personally identifiable data is going to be protected by secure encrypted transmissions from the end user browser to the website on all but the most inadequate of locations, just as your car has functioning locks on the doors in all but the most inadequate of clunkers. (One method of determining if data is being transferred securely is to see if the protocol definition in the leftmost portion of the address bar reads “https” (Hypertext Transfer Protocol Secure) instead of the usual “http.”)
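The “https” check described in the parenthetical is simple enough to sketch in Python. The function name `uses_secure_protocol` and the example addresses are made up for illustration. Keep in mind this only reads the label at the front of the address; it says nothing about whether the site on the other end deserves your trust.

```python
from urllib.parse import urlsplit


def uses_secure_protocol(address: str) -> bool:
    """Return True if the address starts with the "https" protocol."""
    return urlsplit(address).scheme == "https"


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    print(uses_secure_protocol("https://www.example.com/checkout"))
    print(uses_secure_protocol("http://www.example.com/checkout"))
```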
“User Generated” data, all the data supplied in the way of commentary, opinion, ratings, referrals, discounts, etc., is analyzed to reveal more about the product purchaser than they would likely care to reveal themselves, but it is stored locally by the company from which the purchase was made. Retail companies spend valuable time and resources acquiring and processing this information to make themselves more competitive in the marketplace; they aren’t going to want it becoming available to competitors and risk losing their advantage.
A case in point illustrating the value of end user data and the “property” mentality of companies collecting this information is the Borders book store bankruptcy auction. When Borders went bankrupt in 2011, the competing company Barnes & Noble purchased, among other assets, the end user data being held by Borders. This purchase demonstrates both the value of this data and its classification as company owned, transferable property. That Barnes & Noble had to buy the data is evidence that it had not been shared. Obviously Barnes & Noble wouldn’t have been buying the data if they had previously had access to it, paid or not.
Now that we’ve purchased this great sweatshirt, let’s go ahead and tell our 130 friends (the current average number of friends a person has on Facebook2) about it.
Having already discussed the browser’s role, what IP addresses are and how they relate to the domain name, we’ll make this trip to our friends much shorter and dispense with those details. Going out to my favorite free social networking site, I log in and type up a bit of commentary on how much I like the sweatshirt and where I plan to wear it first. Then I send out a tweet on Twitter. Finally, I fire off an email from my free email service to Uncle Harvey because the sweatshirt looks just like the one he wears in my fondest childhood memories. The primary difference between the first trip, taken to buy the sweatshirt, and the several we took here to talk about the sweatshirt is in where the data is stored. In both cases the end user data is stored in a central repository, meaning a repository storing the information of many different users. The information related to the actual sweatshirt purchase was stored in a centralized private database, meaning that while it is still a company database, the data stored remains an intimate conversation between the company and myself, much like actually going to a store. In the cases of the trips I made on the I-Way to tell my friends and family about my purchase, the data is stored in centralized public repositories, meaning that the information can be used by anyone paying for access to it. While the question of who “owns” the user generated data is debated heavily, there is no argument that it becomes a core resource of the company storing the data, is processed to a degree over which the end user has no control and is resold as a commodity. Whereas the private database owner has a vested interest in keeping the user data they store to themselves, the public data repository must sell the user data they store to stay in business, because someone has to pay for the services the end user is getting for free.
There are ways I could have passed along all the same communications about my great new sweatshirt and kept the information private between myself and those I communicated with, just like with the described purchase. But before we can change our route on the I-Way we need to first better understand where we are now and the landscape around us.


1 “Mozilla Firefox Web Browser — Firefox Features.” http://www.mozilla.org/en-US/firefox/features/ (Accessed October 3, 2011).
2 “Statistics | Facebook.” http://www.facebook.com/press/info.php?statistics (Accessed October 5, 2011).