
Minimizing Your Trust Footprint

I originally published the following last year as an article in The Recompiler, a magazine focused on technology through the lens of feminism. I present it here with as few modifications from its original publication as possible.


For everyone who chooses to engage with it, the Internet poses a conflict between convenience and control of our identities and our data. However trivially we interact with online services—playing games, finding movies or music, connecting to others on social media—we leave identifying information behind, intentionally or not. In addition, we relinquish some or even all rights to our own creations when we share our content with others, such as whenever we write on Medium.

Most of us give this incongruity some cursory thought—even if we don’t frame it as a conflict—such as when we adjust our privacy settings on Facebook. With major data breaches (of identifying, health, financial, or personal info) and revelations of widespread, indiscriminate government surveillance in the news over the last few years, probably more of us are thinking about it these days. In some way or another, we all must face up to the issue.

At one extreme, it’s possible to embrace convenience completely. Doing so means handing over information about ourselves without regard for how it will be used or by whom. At the other extreme, there’s a Unabomber-like strategy of complete disconnection. This form of non-participation comes along with considerable economic and social disenfranchisement.

The rest of us walk a line between the two, maybe hewing nearer to one extreme or the other as our circumstances allow. This includes me—and as time passes, I usually try to exert more control over my online life, but I still trade off for convenience or access. I use an idea I call my trust footprint to make this decision on a case-by-case basis.

For example, I realized I began to distrust Google because the core of their business model is based on advertising. I wrote a short post on my personal website about my motives and process, but to sum up, I didn’t want to be beholden to a collection of services that made no promises about my privacy or their functionality or availability in the future. I felt powerless using Google, and I knew this wouldn’t change because they have built their empire on advertising, a business model which puts the customers’ privacy and autonomy at odds with their success.

Before I began to distrust Google, I didn’t give my online privacy or autonomy as much thought as I do today. When I began getting rid of my Google account and trying to find ways to replace its functionality, I had to examine my motives, in order to clarify the intangible problem Google posed for me.

I concluded that companies which derive their income from advertising necessarily pit themselves against their customers in a zero-sum game to control those customers’ personal information. So I try to avoid companies whose success is based on selling the customer instead of a product.

Facebook, as another example, needs to learn more about their users and the connections between them in order to charge advertisers more and, in turn, increase revenue. To do so, they encourage staying in their ecosystem with games and attempt to increase connections among users with suggestions and groups. As noted in this story about Facebook by The Consumerist last year:

Targeted ads are about being able to charge a premium to advertisers who want to know exactly who they’re reaching. Unfortunately, in order to do so, Facebook has to compromise the privacy of its hundreds of millions of users.

Most social networks, Twitter included, engage in similar practices.

Consequently, my first consideration when gauging my trust footprint is to ask who benefits from my business: What motivates them to engage with users, and what will motivate them in the future? This includes thinking about the business model under which online services I choose operate—to the extent this information is available and accurate, of course.

In practice, this information often isn’t clear, available up front, or permanent, so there’s really a lot of guessing involved. The “trust” part is quite literal—I don’t actually know what’s going to happen or whether my information will eventually be leaked, abused, or sold. Some reading and research can inform my guesses, but they remain guesses. I don’t trust blindly, but it is still something of an act of faith.

It’s for that reason my goal isn’t to avoid online services completely or to use only those that are fully and radically transparent. I only want to minimize the risk I take with my information, reduce the scale of the information I provide, and limit my exposure to events I can’t control.

The second consideration I make in keeping my trust footprint in check is to question whether a decision I make actually enlarges it. For instance, when I needed a new calendaring service after leaving Google, I realized that I could use iCloud to house and sync my information because I had already exposed personal information to iCloud. I didn’t have to sign up for a new account anywhere, so my trust footprint wasn’t affected.

The tricky part about that last consideration is that online services have tendrils that themselves creep into yet more services. In the case of Dropbox, which provides file storage and synchronization, they essentially resell Amazon’s Simple Storage Service (AWS S3), so if you don’t trust Amazon or otherwise wish to boycott them, then avoiding Dropbox comes along in the bargain. The same goes for a raft of other services, like Netflix and Reddit, who all use Amazon Web Services to drive their technology.

That means it’s not just home users who are storing their backups and music on servers they don’t control. Whether you call it software-as-a-service or just the “cloud,” services have become interconnected in increasingly technological and political ways.

It doesn’t end with only outsourcing the services themselves. All these online activities generate vast amounts of data which must be refined into information—information that carries copious value, even for things as innocuous as who’s watching what on TV. Nielsen’s business model of asking customers what they are watching has already become outdated. Nowadays, the media companies know what you watch; the box you used to get the content has dutifully reported it back, and in turn, they’ve handed that data over to another company altogether to mine it for useful information. This sort of media analytics has become an industry in its own right.

As time passes, it will become harder to avoid interacting with unknown services. Economies of scale have caused tech stacks to trend more and more toward centralization. It makes sense for companies: if Amazon controls all their storage, for example, then storage becomes wholly Amazon’s problem, and Amazon can offer it even more cheaply than companies which go out and build their own reliable storage.

Centralization doesn’t have to be bad, of course. It’s enabled companies to spring up which may not have been viable in the past. For example, Simple is an online bank which started from the realization that to get started with an entirely new online bank, “pretty much all you need is a license from the Fed and a few computers.”

The upshot is that managing your online life so that it stays entirely within your control becomes increasingly fraught as centralization proceeds. When you back up to “the cloud,” try to imagine whether your information is sitting on a hard disk drive in northern Virginia, or maybe a high-density tape in the Oregon countryside.

It’s not even necessary to go online yourself to interact with these business-to-business services. Small businesses have always relied upon vendors for components of their business they simply can’t provide on their own, and those vendors have learned they can resell other bulk services in turn. The next time you see the doctor, ask yourself, into which CRM system did your doctor just input your health information? Where did the CRM store that information? Maybe in some cosmic coincidence, it’s sitting alongside your backups on the same disk somewhere in a warehouse. Probably not, but it could happen.

My trust footprint, just like my carbon footprint, is a fuzzy but useful idea for me, one which acknowledges that participation in the online world carries inevitable risk—or at least an inevitable cost. It helps me gauge whether I’m closer to or further from my ideal privacy goals. And just as we can’t all become carbon neutral overnight without destroying the global economy, it’s not practical to run around telling everyone to unplug or boycott all online services.

Next time you’re filling out yet another form online, opening yet another service, trying out one more new thing, remember that you’re also relinquishing a little control over what you create and even a small part of who you are. And if this thought at all gives you pause, see if there’s anything you can do to reduce your trust footprint a little. Maybe you can look into hosting your own blog for your writing, getting network-attached storage for your home instead of using a cloud service, limiting what you disclose on social media, or investing in technology that takes privacy seriously.

Beginning with Regular Expressions

I originally published the following last year as an article in The Recompiler, a magazine focused on technology through the lens of feminism. It began as a primer on picking up regular expressions for a friend who was learning to program at the time. I regarded it as an exercise in making a complex topic as accessible as possible.

It assumes an audience familiar with general computer concepts (such as editing text), but it does not necessarily assume a programming background. I present it here with as few modifications from its original publication as possible.


Regular expressions are short pieces of text (often I’ll call a single piece of text a “string,” interchangeably) which describe patterns in text. These patterns can be used to identify parts of a larger text which conform to them. When this happens, the identified part is said to match the pattern. In this way, unknown text can be scanned for patterns, ranging from very simple (a letter or number) to quite complex (URLs, e-mail addresses, phone numbers, and so on).

The patterns shine in situations where you’re not precisely sure what you’re looking for or where to find it. For this reason, regular expressions are a feature common to many technical programs which work with lots of text. Most programming languages also incorporate them as a feature.

One common application of regular expressions is to move through a body of text to the first part which matches a pattern—in other words, to find something. It’s then possible to build on this search capability to replace a pattern automatically. Another use is to validate text, determining whether it conforms to a pattern and acting accordingly. Finally, you (or your program) may only care about text which matches a pattern, and all other text is irrelevant noise. With regular expressions, you can cull a large text down to something easier to use, more meaningful, or suitable for further manipulation.

A Simple First Regular Expression

A regular expression, like I said, is itself a short piece of text. Often, it’s written in a special way to set it apart as a regular expression as opposed to normal text, usually by surrounding it with slashes. Whenever I write a regular expression in this post, I will also surround it with slashes on both sides. For example, /a/ is a valid regular expression which matches the string a. That particular expression could be used to find the first occurrence of the letter a in a longer string of text, such as, Where is the cat?. If the pattern /a/ were applied against that sentence, it would match the a in the middle of cat.
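
To make this concrete, here’s the same search in Python, one of many languages with regular expressions built in (its re module uses essentially the syntax described here, minus the surrounding slashes):

import re

# The pattern /a/ matches the first occurrence of the letter a.
match = re.search(r"a", "Where is the cat?")
print(match.group())  # prints "a"
print(match.start())  # prints 14, the position of the a in "cat"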

There’s a clear benefit to using regular expressions to do pattern matching in text. They let you ask for what you want rather than specifying how to find it. To be technical, we’d say that regular expressions are a kind of declarative syntax; contrast that with an imperative method of asking for the same thing. In this case, to do this in an imperative way, you’d have to write instructions to loop through each letter in the text, comparing it to the letter a. In the case of regular expressions, the how isn’t our problem. We’re left simply stating the pattern and letting the computer figure it out.
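
A minimal sketch of that contrast, again in Python (the function name is mine, purely for illustration):

import re

# Imperative: spell out *how* to find the first a, step by step.
def find_first_a(text):
    for index, character in enumerate(text):
        if character == "a":
            return index
    return -1

# Declarative: state *what* to find and let the engine work out how.
match = re.search(r"a", "Where is the cat?")
index = match.start() if match else -1

print(find_first_a("Where is the cat?"), index)  # 14 14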

Regular expressions are rather rigid and will only do what you say, sometimes with surprising results. For example, /a/ only matches a single occurrence of a, never A, nor à, and will only match the first one. If it were applied to the phrase “At the car wash”, it would match against the first a in car. It would skip over the A at the beginning, and it would stop looking before even seeing the word wash.

As rigid as regular expressions are, they have an elaborate syntax which can describe vast varieties of patterns. It’s possible to create patterns which can look for entire words, multiple occurrences of words, words which only happen in certain places, optional words, and so on. It’s a question of learning the syntax.

While I intend to touch on the various features which allow flexible and useful patterns, I won’t exhaust all the options here, and I recommend consulting a syntax reference once the idea feels solid. (Before getting into some of the common features of regular expression syntax, it’s important to note that regular expressions vary from implementation to implementation. The idea has been around a long time and has been incorporated into countless programs, each in slightly different ways, and there have been multiple attempts to standardize them. Despite the confusion, though, there is a lot of middle ground. I’m going to try to stay firmly on this middle ground.)

Metacharacters

Let’s elaborate a bit on our first pattern. Suppose we’re not sure what we’re looking for, only that we know it begins with a c and ends with a t. Let’s think about what kinds of words we might want to match, so we can talk intelligently about what patterns exist in those words. We know that /a/ matches cat. What if we want to match cut instead? We could just use /u/, but we know this also matches unrelated strings, like bun or ambiguous.

Now, /cat/ is a perfectly reasonable pattern, and so is /cut/, but we’d probably have an easier go if we create a single pattern that says we expect the letter c, some other letter we don’t care about, and then the letter t. Regular expressions let us use metacharacters to describe the kinds of letters, numbers, or other symbols we might expect to find without naming them directly. (“Character” is a useful word to encompass letters, numbers, spaces, punctuation, and other symbols—anything that makes up part of a string—so “metacharacter” is a character describing other characters.) In this case, we’ll use a .—a simple dot. In regular expression patterns, a dot metacharacter matches any individual character whatsoever. Our regular expression now looks like /c.t/ and matches cat, cut, and cot, among other things.
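
If you’d like to check this behavior yourself, here’s a quick Python experiment with the dot metacharacter:

import re

# /c.t/: a c, any single character, then a t.
for word in ["cat", "cut", "cot", "ct", "coat"]:
    print(word, bool(re.search(r"c.t", word)))
# cat True, cut True, cot True,
# ct False (no middle character), coat False (two characters between c and t)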

In fact, we might describe metacharacters as any characters which do not carry their literal meaning, and so regular expressions may contain both characters and metacharacters. Occasionally, it can be confusing to know which is which, and usually it’s best to consult a reference for the regular expression implementation you’re using. Sometimes, even more confusingly, we want to use a metacharacter as a character, or vice versa. In that situation, we need to escape the character.

Escaping

We can see in the above example that a dot has a special meaning in a regular expression. Sometimes, though, we might wish to describe a literal dot in a pattern. For this reason, we need a way to describe literal characters which don’t carry their ordinary meaning, as well as employ ordinary characters for new meanings. In a regular expression pattern (as in many other programming languages), a backslash (\) does this job. Specifically, it means that the character directly after it should not be interpreted as usual.

Most often, it can be used to define a pattern containing a special character as an ordinary one. In this context, the backslash is said to be an escape character, which lets us write a character while escaping its usual meaning.

For example, suppose we cared about situations where a sentence ends in the letter t. The easiest pattern to describe that situation might be the letter, followed by a period and a space, but we can’t type a literal dot for that period, or else we’d match words like to. Therefore, our pattern must escape the dot. The pattern we want is written as /t\. /.
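
Here’s a small Python demonstration of the difference escaping makes (the sentence is my own example):

import re

text = "I went to the market. Then home."

# Unescaped, the dot matches any character, including the o in "to".
print(repr(re.search(r"t. ", text).group()))   # prints 'to '

# Escaped, the dot matches only a literal period.
print(repr(re.search(r"t\. ", text).group()))  # prints 't. '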

Quantifiers

Metacharacters may do more than stand in for another kind of character. They may modify the meaning of characters after them (as we’ve already seen with the escape metacharacter) or those before them. They may also stand in for more abstract concepts, such as word boundaries.

Let’s first consider a new situation, using a metacharacter to modify the preceding character. Think back to earlier, when we said we know we want something that begins with a c and ends with a t. Using the pattern /c.t/, we already know that we can match words like cut and cat.

We need a few more special metacharacters, though, before our expression meets our requirements. /c.t/ won’t match, for example, carrot, but it will match concatenate and subcutaneous.

First of all, we need to be able to describe a pattern that basically leaves the number of characters in the middle flexible. Quantifiers allow us to describe how many occurrences of the preceding character we may match. We can say if we expect zero or more, one or more, or even a very particular count of a character or larger expression.

Such patterns become far more versatile in practice. Take, for example, the quantifier +. It lets us specify that the character just before it may occur one or more times, but it doesn’t name an upper limit.

Remember the pattern we wrote to match sentences ending in t? What if we wanted to make sure we matched all the spaces which may come after the sentence? Some writers like to space twice between sentences, after all. In that case, our pattern could look like /t\. +/. This pattern describes a situation in which the letter t is followed by a literal dot and then any number of spaces.
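
A quick check of that pattern in Python, using a double-spaced sentence of my own invention:

import re

# /t\. +/: a t, a literal period, then one or more spaces.
print(repr(re.search(r"t\. +", "It was hot.  Very hot.").group()))
# prints 't.  ' because the + swallowed both spaces after the sentence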

Quantifiers may also modify metacharacters, which makes them truly powerful and very useful. Using the + again, let’s insert it into our /c.t/ pattern to modify the dot metacharacter, giving us /c.+t/. Now we can match “carrot”! In fact, this pattern matches a c followed by any number of any character at all, as long as a t occurs sometime later on.
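
Trying this out in Python shows both the new power and a problem we’ll tackle next:

import re

# /c.+t/: a c, one or more of anything, then a t.
for word in ["carrot", "cat", "ct", "concatenate"]:
    match = re.search(r"c.+t", word)
    print(word, match.group() if match else None)
# carrot -> carrot, cat -> cat, ct -> None,
# concatenate -> concatenat (the pattern is still too permissive)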

There are a few other quantifiers needed to cover all the bases. The following three quantifiers cover the vast majority of circumstances, in which you’re not particularly sure what number of characters you intend to match:

  • * matches zero or more times
  • + matches one or more times
  • ? matches exactly once or zero times

On the other hand, you may have a better idea about the minimum or maximum number of times you need to match, and the following expressions can be used as quantifiers as well.

  • {n} matches exactly n times
  • {n,} matches at least n or more times
  • {n,m} matches at least n but not more than m times
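
Here’s a quick tour of these quantifiers in Python (the words are my own examples):

import re

print(bool(re.search(r"colou?r", "color")))     # True: the u may appear zero times
print(bool(re.search(r"colou?r", "colour")))    # True: or exactly once
print(bool(re.search(r"go{2,}al", "gooooal")))  # True: at least two o's
print(bool(re.search(r"go{2,}al", "goal")))     # False: only one o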

Anchors

We still have “concatenate” and “subcutaneous” to deal with, though. /c.+t/ matches those because it doesn’t care about what comes before or after the match. One strategy we can use is to anchor the beginning or end of the pattern to stipulate we want the text to begin or end there. This is a case where a metacharacter matches a more abstract concept.

Anchors, in this case, let us match the concept of the beginning or the end of a string. (Anchors really refer to the beginning and ends of lines, most of the time, but it comes to the same thing in this case. See a reference guide for more information on this point.) The ^ anchor, which may only begin a pattern, matches the beginning of a string. Likewise, a $ at the end means the text being matched must end there. Using both of these, our pattern becomes /^c.+t$/.

To break this pattern down, we’re matching a string which begins with a c, followed by some indeterminate number of characters, and finally ends with a t. As ^ and $ represent the very beginning and end of the string, we know that we won’t match any string containing anything at all on the line other than the pattern.
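
To see the anchors at work, here’s a short Python check:

import re

# /^c.+t$/: the whole string must begin with c and end with t.
for text in ["carrot", "concatenate", "the cat sat"]:
    print(text, bool(re.search(r"^c.+t$", text)))
# carrot True; concatenate False (ends in e); "the cat sat" False (starts with t)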

Character Classes

Using anchors, though, may not be the best solution. It assumes the string we’re searching within contains only the pattern we’re looking for, and quite often, this is not the case.

The dot is a very powerful metacharacter. Its biggest flaw is that it is too flexible. For example, /^c.+t$/ would match a string such as cat butt, since the dot happily matches spaces too. Patterns also try to match as much as possible (they are greedy). Some regular expression implementations allow you to specify a non-greedy pattern (which I won’t cover here—see a reference), but a better approach is to revisit our requirements and reword them slightly to be more explicit.

We want to match a single word (some combination of letters, unbroken by anything that’s not a letter) which begins with c and ends with t. Considering this in terms of the kinds of characters which may come before, during, and after the match, we want to match something which contains not-alphabetical characters before it, followed by the letter c, then some other alphabetical letters, then the letter t, and then something else that’s not alphabetical.

In the /^c.+t$/ pattern, we need to replace both of the anchors and the middle metacharacter .. Assuming words come surrounded by spaces, we can replace each anchor with just a space. Our pattern now looks like / c.+t /.

Now, as for the dot, we can use a character class instead. Character classes begin and end with a bracket. Anything between is treated as a list of possibilities for the character it may match. For example, /[abc]/ matches a single character which may be either a, b, or c. Ranges are also acceptable. /[0-9]/ matches any single-digit number.

We can use a range which captures the whole alphabet, and luckily, a character class is considered a single character in the context of a pattern, so the quantifier after refers to any character in the class. Putting all this together, we end up with the pattern / c[a-z]+t /.
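
Here’s that pattern tried against a few phrases in Python (the phrases are mine):

import re

for text in ["the cat sat down", "they concatenated it", "a carrot grows"]:
    match = re.search(r" c[a-z]+t ", text)
    print(text, repr(match.group()) if match else None)
# "the cat sat down" -> ' cat ', "they concatenated it" -> None,
# "a carrot grows" -> ' carrot '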

If we want to mix up upper- and lower-case letters, character classes help in this situation, too: / [Cc][a-z]+t /. Now we can match on names like Curt.

Our assumption that words will be surrounded by spaces is a fragile one. It falls apart if the word we want to match is at the very beginning or end, or if it’s surrounded by quotation marks or other punctuation. Luckily, character classes may also list what they do not include by beginning the list with a ^. When ^ comes within brackets rather than at the beginning of a pattern, it doesn’t serve as an anchor; instead, it inverts the meaning of the character class.

If we consider a word to be a grouping of alphabetical characters, then anything around the word would be anything that’s not a letter (or a digit, while we’re at it). Let’s adjust our pattern accordingly: /[^A-Za-z0-9][Cc][a-z]+t[^A-Za-z0-9]/. We’re using the same pattern as before, but the beginning and ending space have become [^A-Za-z0-9].
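
A quick Python check shows the inverted class treating punctuation as a word edge (the sentence is my own example):

import re

pattern = r"[^A-Za-z0-9][Cc][a-z]+t[^A-Za-z0-9]"
text = 'She said, "Carrot cake," and left.'
print(repr(re.search(pattern, text).group()))
# prints '"Carrot ' (the quotation mark and the space both count as edges)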

Escape Sequences

If our pattern is starting to look cumbersome and odd to you, you’re not alone in thinking that. There’s absolutely nothing wrong with the pattern we just wrote, but it has gotten a bit long-winded. This makes it difficult to read, write, and later update.

In fact, many character classes get used so often (and can otherwise be so annoying to write repeatedly) that they’re usually also available as backslashed sequences, such as \b or \w. (This is escaping, again, as I mentioned before, but instead of escaping a special character’s meaning, we’re escaping these letters’ literal meaning. In other words, we’re imbuing them with a new meaning.)

The availability and specific meaning of these escape sequences vary a bit from situation to situation, so it’s important to consult a reference. That said, in our case, we only need a couple which tend to be very common to find.

One of the most common such escape sequences is \w, which stands in for any “word” character. For our purposes, it matches any alphanumeric character. This is good enough for the inside of a word, so we can revisit our pattern and turn it into /[^\w][Cc]\w+t[^\w]/. Our pattern reads a little more logically now: We’re searching for one not-word character (like punctuation or whitespace) followed by an upper- or lower-case c, some indefinite count of word characters, the letter t, and then finally one not-word character.

Notice how I used the escape sequence inside the character classes at the beginning and end of the word. This is perfectly valid and sometimes desirable. For example, it lets us combine several escape sequences when no single one suits.

It also lets us invert their meaning, as you saw in the most recent example, but many escape sequences can be inverted more directly by capitalizing them, such as \W. As a mnemonic for this trick, think of it as shifting the escape sequence (using Shift to type it). In cases where a character class may be inverted in meaning, a capitalized counterpart often exists.

Using \W, now we can pare down the pattern back to something a little more readable: /\W[Cc]\w+t\W/.
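
One last Python check to verify the finished pattern behaves as promised:

import re

pattern = r"\W[Cc]\w+t\W"
print(repr(re.search(pattern, "I would like some carrot cake.").group()))
# prints ' carrot ' (including the surrounding non-word characters)
print(re.search(pattern, "subcutaneous tissue"))  # None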

More Reading

For today, I’m satisfied with our pattern. In a string like I would like some carrot cake., it matches carrot with no trouble, but it doesn’t match cake or even subcutaneous tissue.

There are many more ways to improve it, though. We’ve only laid the groundwork for understanding more of the advanced concepts of regular expressions, many of which could help us make our expression even more powerful and readable, such as pattern qualifiers and zero-width assertions.

Concepts like grouping allow you to break up and manipulate matches in fine-grained ways. Backtracking and extended patterns allow patterns to make decisions based on what they’ve already seen or will see. Some programmers have even written entire programs based on regular expressions, only using patterns!

In short, regular expressions are a deep and powerful topic that very few programmers completely master. Don’t be afraid to keep a reference close at hand—now that you have a grasp of how to start composing patterns, hopefully it will empower you instead of daunting you.

In the Back of the House

I got my first job at fifteen, going on sixteen. I worked for my hometown newspaper as an inserter, and as time passed, I began filling in occasionally as a “pressman.” Inserters were a collective bunch of old ladies (and me) who made spare money assembling the newspaper sections and stuffing in the ad inserts. When I got to help with the actual printing, it took the form of developing, treating, and bending the lithographic plates in preparation for printing. More often, I caught the papers as they rolled off the press to bundle them up for distribution. I also cleaned up, sweeping, taking out trash, and the like, but I wasn’t good at it. I liked to take breaks to play my guitar at the back of the shop, so I think the editor-in-chief who ran things was probably annoyed as piss at me half the time.

There was no question I worked in the bowels of the operation. The real fun (and to the extent a small, rural paper could afford it, the real money) happened at the front of the building where the editor-in-chief and reporters worked. I passed through to gather up trash a few times a week. As I went, I admired the editor-in-chief’s ancient typewriter collection in his office. I enjoyed talking to the lead reporter, who loved Star Trek. The layout team’s work fascinated me, especially as they transitioned to digital layout from cutting and splicing pieces of paper together.

After my tour, I returned to the back, and I only heard from the front when it was time to go to press or when we had to stop the presses. We weren’t a separate world by any means, but we had a job to do, and that job was entirely a pragmatic one: keeping the machinery running and enabling the actual enterprise which paid us. Inasmuch as I felt like an important part of the whole, it was through a sense of responsibility toward the final product.

About a decade later, I stumbled into my current programming career. Now I find myself at the back of the house again. The work echoes my first job sometimes—working on the machinery, keeping things running, along with other programmers and operations folks. This time, though, the job comes with a dose of values dissonance for me. It feels like a wildly inverted amount of prestige goes to us, the people running the machines, instead of the others who are closer to the actual creation (and the customers using it).

I’m not sure our perceived value is unwarranted—programming is hard. I’m more concerned about the relationship between the front and back of the house. It could be that we, as programmers and tech people, undervalue the people making the content and interacting with the customers. I see the skewed relationship when I look at inflated tech salaries. It makes itself evident in startups made up entirely or mostly of engineers. I felt it most acutely when I considered becoming a tech writer, only to be reminded that it could derail my career and cost me monetarily.

I don’t think my observation comes with a cogent point. Maybe only that tech can’t be just about the engineering, any more than a newspaper can be only a printing press.

Functional Programming for Everyone Else

Functional programming has become a hot topic in the last few years for programmers, but non-programmers might not understand what we’re talking about. I can imagine them wondering, “Did programs not work before? Why are they suddenly ‘functional’ now?”

Earlier today, I was tweeting a bit about the challenge of explaining to family or primary school students what the big deal is all about. Even we programmers take some time to cotton on to the notion. I know I’d have to pause for a moment if someone asked me what Haskell offers over Python.

If you’re not a programmer and have been wondering what the big deal is, here’s my attempt to lay it out.


First, consider the time just before computers existed. By the 1920s, a kind of math problem called a decision problem had led mathematicians to investigate the problem-solving process itself. To do this, they had to invent an idea we call computability: automating problem solving using small, reusable pieces. A couple of mathematicians tackled this idea in two different ways (though they each came to the same conclusion), and today we have two ways to think about computability as a result.

I’m partial to Alan Turing’s approach because it’s very practical. He envisioned a process that’s quite mechanical. In fact, we now call it a Turing machine, even though his machine never actually existed. It was more of a mental exercise than something he intended to build.

To solve a problem with a Turing machine, he would break a problem into a sequence of steps which would pass through the machine on an endless tape. The machine itself knew how to understand the steps on the tape, and it knew how to store information in its memory. As the steps on the tape passed through, one at a time, the machine would consult the tape and its own memory to figure out what to do. This usually meant modifying something in its memory, which in turn could affect the following step, over and over until the steps ran out. By choosing the right set of steps, when you were done, the machine’s memory would end up with the answer you needed.

Since that time, most computers and programs have been based on this concept of stringing together instructions which modify values in memory to arrive at a result. Learning to program means learning a vast number of details, but much of it boils down to understanding how to break a problem into instructions which accomplish the same thing. Programs made this way would not be considered “functional.”

At the same time, another mathematician, Alonzo Church, came up with another approach called lambda calculus. At its heart, it has a lot in common with Turing’s approach: lambda calculus breaks up a problem into small parts called functions. Instead of modifying things in memory, though, the key proposition of a function is that it takes input and calculates a result—nothing more. To solve a problem this way, little functions are written to calculate small parts of the problem, which are in turn fed to other functions which do something else, and so on until you get an answer.

Lambda calculus takes a much more abstract approach, so it took longer to work out how to make programs with it. When we did, we called these programs “functional programs” because functions were so fundamental to how they worked.


Putting all this together, I think of functional programs as ones which do their jobs without stopping to take notes along the way. As a practical consequence, this implies a few odd things. The little niceties that come second nature to procedural programs—like storing values, printing out text, or doing more than one thing at once—don’t come easily to functional programs. On the other hand, functional programs make it easier to understand what a program will do, since a function will do the same thing every time if its input doesn’t change.
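
To make the contrast concrete, here’s a small sketch in Python; the function names are mine, purely for illustration:

# Turing-style (imperative): keep a note in memory and update it.
def total_imperative(numbers):
    total = 0              # a little slot of memory
    for n in numbers:
        total = total + n  # each step modifies that memory
    return total

# Church-style (functional): each function only turns input into output.
def total_functional(numbers):
    if not numbers:
        return 0
    return numbers[0] + total_functional(numbers[1:])

print(total_imperative([1, 2, 3, 4]))  # 10
print(total_functional([1, 2, 3, 4]))  # 10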

I think both approaches have something to offer, and in fact, most programs are made with a combination of these ideas. Turing proved that neither approach is more powerful than the other; they’re just two ways of ending up at the same result. Programmers each have to decide for themselves which approach suits best—and that decision problem can’t be solved by a program yet.

A Gentle Primer on Reverse Engineering

Over the weekend at Women Who Hack I gave a short demonstration on reverse engineering. I wanted to show how “cracking” works, to give a better understanding of how programs work once they’re compiled. It also serves my abiding interest in processors and other low-level stuff from the 80s.

My goal was to write a program which accepts a password and outputs whether the password is correct or not. Then I would compile the program to binary form (the way in which most programs are distributed) and attempt to alter the compiled program to accept any password. I did the demonstration on OS X, but the entire process uses open source tools from beginning to end, so you can easily do this on Windows (in an environment like Cygwin) or on Linux. If you want to follow along at home, I’m assuming an audience familiar with programming, in some form or another, but not much else.

Building a Program

I opened a terminal window and fired up my text editor (Vim) to create a new file called program.c. I wanted to write something that would be easy to understand, edit, and yet still could be compiled, so C seemed like a fine choice. My program wasn’t doing anything that would’ve been strange in the year 1982.

First, I wrote a function for validating a password, along with the include directives the program needs.

#include <stdio.h>   /* printf, scanf */
#include <stdlib.h>  /* malloc, free */
#include <string.h>  /* strcmp */

int is_valid(const char* password)
{
    /* strcmp returns 0 when the two strings are identical */
    if (strcmp(password, "poop") == 0) {
        return 1;
    } else {
        return 0;
    }
}

This function accepts a string and returns a 1 if the string is “poop” and 0 otherwise. I’ve chosen to call it is_valid to make it easier to find later. You’ll understand what I mean a few sections down.

Now we need a bit of code to accept a string as input and call is_valid on it.

int main()
{
    char* input = NULL;
    input = malloc(256);
    printf("Please input a word: ");
    scanf("%s", input);

    if (is_valid(input)) {
        printf("That's correct!\n");
    } else {
        printf("That's not correct!\n");
    }

    free(input);
    return 0;
}

This source code is likewise pretty standard. It prompts the user to type in a string and reads it into a variable called input. Once that’s done, it calls is_valid with that string. Depending on the result, it either prints “That’s correct!” or “That’s not correct!” and exits, returning control to the operating system. Together with the include directives shown at the top of the first listing, this is a fully functioning program.

Let’s build it! I saved the file program.c and used the command gcc program.c -o program to build it.

This outputs a file in the current directory called program which can be executed directly. Let’s run our program by typing ./program. It’ll ask us to put in a word to check. We already know what to put in (“poop”), so let’s do that and make sure we see the result we expect.

Please input a word: poop
That's correct!

And if we run it again and type in the wrong word, we get the other possible result.

Please input a word: butts
That's not correct!

So far, so good.

A Deeper Look

There’s nothing special about this program that makes it different from your web browser or photo editor; it’s just a lot simpler. I can demonstrate this on my system with the file command. Trying it first on the program I just built, with the command file program, I see:

program: Mach-O 64-bit executable x86_64

This is the file format OS X uses to store programs. If this kind of file seems unfamiliar, the reason is that most applications are distributed as app bundles which are essentially folders holding the executable program itself and some ancillary resources. Again, with file, we can see this directly by running file /Applications/Safari.app/Contents/MacOS/Safari:

/Applications/Safari.app/Contents/MacOS/Safari: Mach-O 64-bit executable x86_64

Let’s learn a little more about the binary we just built. We can’t open it in a text editor, or else we get garbage. Using a program called hexdump we can see the raw binary information (translated to hexadecimal) contained in the file. Let’s get a glimpse with hexdump -C program | head -n 20.

00000000  cf fa ed fe 07 00 00 01  03 00 00 80 02 00 00 00  |................|
00000010  10 00 00 00 10 05 00 00  85 00 20 00 00 00 00 00  |.......... .....|
00000020  19 00 00 00 48 00 00 00  5f 5f 50 41 47 45 5a 45  |....H...__PAGEZE|
00000030  52 4f 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |RO..............|
00000040  00 00 00 00 01 00 00 00  00 00 00 00 00 00 00 00  |................|
00000050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000060  00 00 00 00 00 00 00 00  19 00 00 00 28 02 00 00  |............(...|
00000070  5f 5f 54 45 58 54 00 00  00 00 00 00 00 00 00 00  |__TEXT..........|
00000080  00 00 00 00 01 00 00 00  00 10 00 00 00 00 00 00  |................|
00000090  00 00 00 00 00 00 00 00  00 10 00 00 00 00 00 00  |................|
000000a0  07 00 00 00 05 00 00 00  06 00 00 00 00 00 00 00  |................|
000000b0  5f 5f 74 65 78 74 00 00  00 00 00 00 00 00 00 00  |__text..........|
000000c0  5f 5f 54 45 58 54 00 00  00 00 00 00 00 00 00 00  |__TEXT..........|
000000d0  10 0e 00 00 01 00 00 00  e7 00 00 00 00 00 00 00  |................|
000000e0  10 0e 00 00 04 00 00 00  00 00 00 00 00 00 00 00  |................|
000000f0  00 04 00 80 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000100  5f 5f 73 74 75 62 73 00  00 00 00 00 00 00 00 00  |__stubs.........|
00000110  5f 5f 54 45 58 54 00 00  00 00 00 00 00 00 00 00  |__TEXT..........|
00000120  f8 0e 00 00 01 00 00 00  1e 00 00 00 00 00 00 00  |................|
00000130  f8 0e 00 00 01 00 00 00  00 00 00 00 00 00 00 00  |................|

The left column is the “offset,” in hexadecimal (like line numbering, it tells us how many bytes into the file we are on a particular line). The middle two columns are the actual contents of the file itself, again in hexadecimal. The right column shows an ASCII equivalent for the file’s contents, where possible. If you pipe the file’s contents to less you can scan through and see mostly a lot of garbage and also a few familiar strings. If you’re interested in knowing what pieces of text are embedded in a file, the program strings speeds this process up a great deal. In our case, it tells us:

poop
Please input a word:
That's correct!
That's not correct!

So clearly those strings are still floating around in the file. What’s the rest of this stuff? Volumes of documentation exist out there on the Mach-O file format, but I don’t want to bog down in the details. I have to level with you here—I honestly don’t actually know much about it. Analogizing from other executable formats I’ve seen before, I know there’s probably a header of some kind that helps the operating system know what kind of file this is and points out how the rest of the file is laid out. The rest of the file, incidentally, is made up of sections which may contain any of a number of things, including data (the strings in this case) built into the program; information on how to find code called from elsewhere in the system (imports, like our printf and strcmp functions, among others); and executable machine code.

Disassembling the Program

It’s the machine code we’re interested in now. This is the interesting part! Machine code is binary data, a long string of numbers which correspond to instructions the processor understands. When we run our program, the operating system looks at the file, lays it out in memory, finds the entry point, and starts feeding those instructions directly to the processor.

If you’re used to scripted programming languages, this concept might seem a little odd, but it bears on what we’re about to do to our binary. There’s no interpreter going over things, checking stuff, making sure it makes sense, throwing exceptions for errors and ensuring they get handled. These instructions go right into the processor, and being a physical machine, it has no choice but to accept them and execute each one. This knowledge is very empowering because we have the final say over what these instructions are.

As you may know, the compiler gcc translated the source code I wrote earlier into machine language (and packaged it nicely in an executable file). This allows the operating system to execute it directly, but as another important consequence of this process, we also no longer need the source code. Most of the programs you run likely came as binary executables without source code at all. Others may have source code available, but they’re distributed in binary form.

Whatever the case, let’s imagine I lost the source code to program up above and can’t remember it. Let’s also imagine I can’t even remember the password, and now my program holds hostage important secrets.

You might think I could run the binary through the strings utility, hoping the password gets printed out, and in this case, you’d be on the right track. But imagine if the program didn’t have a single password built in and instead only accepted passwords whose letters were in alphabetical order or added up (in binary) a specific way. Then no string in the file would be the password: I couldn’t scan for strings that seem interesting, and I wouldn’t have a clue what to type in.

But we don’t need to lose heart because we already know that the program contains machine code, and since this machine code is meant to be fed directly to the processor, there’s no chance it’s been obfuscated or otherwise hidden. It’s there, and it can’t hide. If we knew how to read the machine code, there would be no need for the source code.

Machine code is hard for a human to read. There’s a nice GNU utility called objdump which helps enormously in this respect. We’ll use it to disassemble the binary. This process is called “disassembly” instead of “decompilation” because we can’t get back the original source code; instead we can recover the names of the instructions encoded in machine code. It’s not ideal, but we’ll have to do our best. (Many people use a debugger to do this job, and there’s a ton of benefits to doing so, like being able to watch instructions execute step by step, inspect values in memory, and so on, but a disassembly listing is simpler and less abstract.)

I looked up the documentation for gobjdump (as it’s called on my system) and picked out some options that made sense for my purposes. I ended up running gobjdump -S -l -C -F -t -w program | less to get the disassembly. This is probably more than we’d care to know about our program’s binary, much of it mysterious to me, but there’s some very useful information here too.

The Disassembly

I’ll share at least what I can make of the disassembly. At the top of the listing is some general information, including a symbol table. The symbol table is interesting because we can see the names of the functions I defined. If I had truly never seen the source code, I would at this point take an especial interest in a function called is_valid, wouldn’t I?

Immediately below this is a “Disassembly of section .text”. I happen to know from past experience that the “.text” name is misleading for historical reasons; a “.text” section actually contains machine code! The leftmost column contains offsets (the place in the file where each instruction begins). The next column is the binary instructions themselves, represented in hexadecimal. After that are the names and parameters of each instruction (sometimes with a helpful little annotation left by objdump).

Of course, the very first thing I see is the instructions of the is_valid function.

Disassembly of section .text:

0000000100000e10  (File Offset: 0xe10):
   100000e10:   55                      push   %rbp
   100000e11:   48 89 e5                mov    %rsp,%rbp
   100000e14:   48 83 ec 10             sub    $0x10,%rsp
   100000e18:   48 89 7d f0             mov    %rdi,-0x10(%rbp)
   100000e1c:   48 8b 7d f0             mov    -0x10(%rbp),%rdi
   100000e20:   48 8d 35 33 01 00 00    lea    0x133(%rip),%rsi        # 100000f5a <strcmp$stub+0x4a> (File Offset: 0xf5a)
   100000e27:   e8 e4 00 00 00          callq  100000f10 <strcmp$stub> (File Offset: 0xf10)
   100000e2c:   3d 00 00 00 00          cmp    $0x0,%eax
   100000e31:   0f 85 0c 00 00 00       jne    100000e43 <is_valid+0x33> (File Offset: 0xe43)
   100000e37:   c7 45 fc 01 00 00 00    movl   $0x1,-0x4(%rbp)
   100000e3e:   e9 07 00 00 00          jmpq   100000e4a <is_valid+0x3a> (File Offset: 0xe4a)
   100000e43:   c7 45 fc 00 00 00 00    movl   $0x0,-0x4(%rbp)
   100000e4a:   8b 45 fc                mov    -0x4(%rbp),%eax
   100000e4d:   48 83 c4 10             add    $0x10,%rsp
   100000e51:   5d                      pop    %rbp
   100000e52:   c3                      retq   
   100000e53:   66 66 66 66 2e 0f 1f 84 00 00 00 00 00  data16 data16 data16 nopw %cs:0x0(%rax,%rax,1)

This is super exciting because we’re about to read assembly language. There are lots of books and sites on this subject, and my own understanding of assembly language is a bit rusty from years of disuse, but I know enough to get the gist. Let’s break it down.

  • The first three instructions (the first three lines, starting with 100000e10) are a preamble that begin most functions in assembly language generated by a compiler. They’re not important for us. (It saves the old frame pointer, gets a new frame pointer, and clears space on the stack for locals.)
  • The next two instructions set up for our strcmp function. This looks a bit odd in assembly language compared to what we’re used to. The mov instructions are shifting data from a location in memory to a register and vice versa. Because registers are involved, the disassembly wasn’t able to hint very well at what these values may be, but we can guess it’s moving the strings to compare into place. I know this because of the calling convention for the function call (basically, set up the data and then make the call, which will know where to find the data); because %rbp is the base register, which usually points to data; and because -0x10(%rbp) is a way of saying “look sixteen bytes earlier in memory than the address in the %rbp register.”
  • The lea and callq instructions load and call the strcmp function using the parameters we just moved in place. That function lives elsewhere in the system, so some magic happens here to transfer control of our program to that function.
  • By the time we reach the cmp instruction, strcmp has done its thing and stored its result in the accumulator register %eax. By convention, return values usually live in %eax, so given that we’re using a cmp (“compare”), and it’s acting on %eax and $0x0 (a zero), it’s a safe bet we’re checking to make sure strcmp returned zero. This instruction has the side effect of setting a flag in the processor called ZF to either 1 or 0, depending on whether the comparison is true.
  • The next instruction is jne which is short for “jump if not equal.” It checks the ZF flag, and if it’s zero, skips ahead twelve bytes (bypassing any instructions in the intervening space).
  • That’s followed by a movl and a jmpq. These instructions move a 1 into a location in memory and skip ahead another seven bytes. Look at the two-digit hexadecimal numbers to the left of these two instructions. They add up to twelve!
  • Likewise, after these instructions, one other instruction moves the value 0 into the same location of memory and continues ahead. This instruction is exactly seven bytes long. So the jumps guarantee one of two outcomes: the memory location -0x4(%rbp) will hold either a 1 or a 0 by the time we get to the final mov. This is how assembly language does an if—a very interesting detail we’ll return to.
  • That last mov puts the value at -0x4(%rbp) (we just saw it’s either a 1 or a 0) into %eax, which we know is going to be the return value.
  • Finally, the function undoes the work from the preamble and returns. (After that is some junk that’s never executed.)

That was a lengthy explanation, so to sum up, we learned that the binary executable has a function called is_valid, and this function calls strcmp with some values and returns either a 1 or a 0 based on its return value. That’s a pretty accurate picture based on what we know of the source code, so I’m pleased as punch!

Directly below the definition for this function is the main function. It’s longer, but it’s no more complex. It does the same basic tasks of moving values around, calling functions, inspecting the values, and branching based on this. Again, the values are difficult to get insight into because many registers are used, and there’s a bit more setup. For the sake of brevity, I’ll leave analyzing this function as an exercise for the reader (I promise it won’t be on the test).

Breaking the Program

Remember, we don’t have the slightest idea what the password is, and there’s no good indication from the disassembly what it might be. Now that we have a good understanding of how the program works, we stand a good chance of modifying the program so that it believes any password is correct, which is the next best thing.

We can’t modify this disassembly listing itself. It’s output from objdump meant to help us understand the machine code (the stuff in the second column). We have to modify the program file itself by finding and changing those hexadecimal numbers somewhere in the file.

After looking over how both is_valid and main work, there are lots of opportunities to change the flow of the program to get the result we want, but we have to stay within a few rules. Notice how a lot of the instructions specify where other parts of the program are in terms of relative counts of bytes? That means that we can’t change the number of bytes anywhere, or else we’d break all the symbol references, section locations, jumps, offsets, and so on. We also need to put in numbers which are valid processor instructions so that the program doesn’t crash.

If this were your first program, I’d be forced to assume you wouldn’t know what numbers mean what to the processor. Luckily, the disassembly gives us hints on how to attack it. Let’s confine our possibilities (such as changing jump logic or overwriting instructions with dummy instructions) to only those we can exploit by looking at this disassembly itself. There isn’t a lot of variety here.

To me, one neat thing about is_valid stands out. Two of the lines are extremely similar: movl $0x0,-0x4(%rbp) and movl $0x1,-0x4(%rbp). They do complementary things with the same memory location, use the same number of bytes (seven), involve the same setup, are near one another, and directly set up the return value for is_valid. This says to me the machine code for each instruction would be interchangeable, and by changing one or the other, we can directly change the return value of is_valid to whatever we want. It’s a safe bet, with a function named that, we want it to return a 1, but if we weren’t sure, we could look ahead to the main function and see how its return value gets used later on.

In other words, we want to change movl $0x0,-0x4(%rbp) to be movl $0x1,-0x4(%rbp) so that no matter what, is_valid returns a one. The machine code for the instruction we have is c7 45 fc 00 00 00 00. Conveniently, the machine code for that precise instruction we want is just two lines above: c7 45 fc 01 00 00 00. The last challenge ahead is to find these bytes in the actual file and change them.

Where in the file are these bytes? Note that the listing says “File Offset: 0xe10” for the function is_valid. That’s the count of bytes into the file at which we’d find the first instruction of this function (3600 bytes, in decimal), and the offset in the left column for the first instruction is “100000e10”, so the offsets in the left column tell us where in the file each instruction’s machine code lives. The instruction we care about is at “100000e43”, so it sits 3651 bytes into the file. We only need to change the fourth byte of the instruction, which lies three bytes further along, at an offset of 3654 bytes.

Using hexdump -C program | less and scrolling ahead a bit, I find a line like this one:

00000e40  00 00 00 c7 45 fc 00 00  00 00 8b 45 fc 48 83 c4  |....E......E.H..|

Sure enough, there’s the instruction, and the seventh byte on this line is the one we want to change. Patching a binary file from the command line is sort of difficult, but this command should do the trick:

printf '\x01' | dd of=program bs=1 seek=3654 count=1 conv=notrunc

dd is writing to the file program (of=program), seeking by one byte at a time (bs=1), skipping ahead 3654 bytes to land on the byte we want (seek=3654), changing only one byte (count=1), and not truncating the rest of the file (conv=notrunc).
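
If dd’s flags feel opaque, the same one-byte patch can be sketched in a few lines of Python (assuming the offset we just computed):

# Overwrite the single byte at offset 3654 with 0x01.
with open("program", "r+b") as binary:
    binary.seek(3654)
    binary.write(b"\x01")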

Now I’ll run the program the same way we did before (./program) and see if this worked.

Please input a word: butts
That's correct!

Success!

Conclusions

That’s about it. It’s a contrived example, and I knew it would work out before the end, but this is a great way to start learning how programs are compiled, how processors work, and how software cracking happens. The concepts here also apply to understanding how many security exploits work on a mechanistic level.

Clarity Through Static Typing

I can’t seem to find much discussion online contrasting dynamic and static typing as teaching tools. Others have covered the technical merits up and down, but I wanted to make a case for static typing for teaching new programmers.

It’s true that it’s easier, even necessary, to elide abstract concepts like types when first starting out. Dynamically typed programming languages (like Python, Ruby, or JavaScript) allow learners to get started quickly and see results right away. It’s important not to underestimate the value of that quick, tight feedback loop! While getting started, students don’t need to know that “Hello World” is skating on layers of abstractions on the way to the screen.

At the risk of veering into criticizing dynamic typing itself (which isn’t my intention!), languages like Ruby and Python unfortunately also lengthen the feedback cycle between making certain kinds of mistakes and seeing the errors they produce. In the worst cases, the error becomes much more difficult to understand when it finally occurs. Testing becomes crucial to ferret out these kinds of errors.

That’s a relatively minor concern of mine, though. I’m more concerned about what happens when a student turns into a new programmer interacting with a non-trivial system. It’s inevitable that a new programmer will have to learn an existing complex system—if not on the job, then at the least while learning a web framework. At this point, she will have to use or modify some part of the system before understanding the whole. In other words, a new programmer will have to point at a symbol or word on the screen and ask, “What is that?”

In a language like Ruby or Python, it literally takes longer to figure out what a variable on the screen represents, and it sometimes requires searching through many files and holding many abstractions in your head to understand any non-trivial piece of source code. Using or modifying a complex system requires deeper and more expert knowledge of the system. It’s for this reason that I feel static typing helps peel away abstractions. It also makes information about the system more explicit, closer at hand, and more readily searchable.
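As a rough sketch of what I mean (using Python’s optional annotations as a stand-in for real static typing; the names are hypothetical), a signature can answer “What is that?” right at the definition, where a checker like mypy can also verify it:

from dataclasses import dataclass

@dataclass
class Reading:
    celsius: float
    sensor_id: str

def average_temperature(readings: list[Reading]) -> float:
    # The annotation says what "readings" is; no spelunking through other files.
    return sum(r.celsius for r in readings) / len(readings)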

I find it ironic in the case of Python especially. “Explicit is better than implicit,” say the Pythonistas—except when it comes to types?

Most of the Mistakes

When I interviewed at Simple, I wanted to get across one very important thing—if I got hired, I would begin by making every possible mistake, but I would only make each mistake once.

I think enough time has passed that I’ve managed to make most of the mistakes I needed to make to get past feeling stuck. Five months in, and I’m finally feeling like I’m contributing steadily. My world at work has also widened, putting me in contact with other teams.

One of the major shortcomings of my last job was that I spent so much time helping with maintenance that I never got to create things. Programming actually became a rare part of my job. I spend almost every day at Simple actually programming, and I’m really thankful for that. Not necessarily because I enjoy programming in and of itself (that comes and goes) but because I get to have a say in how things work, and I get to help drive us forward.

Greater Evils, or, Where My Code Lives Now

I’m going to tear down my GitLab instance and just host my programming things on GitHub. It comes down to the ready-made community, the well-worn functionality (everything will always just work), and the ease of not having to host a complex web app.

Less soul searching went into this decision than usual, and I know I’m making the wrong decision. It’s wrong on its face, to me, but I make wrong decisions like this all the time. I’m still using Twitter, for example, even knowing my content is monetized on my behalf. I think the only thing that’s changed is my drained wherewithal to fight back quotidian evils.

If you find my blog where before you found my GitLab site, now you know why. I haven’t moved absolutely everything, so let me know if something you’d like to look at is missing.

What I Need from a Notes App

Here’s what I want from an ideal note-taking application.

  • Runs on every conceivable platform. May or may not include a degenerate web app. This is the one thing Evernote nails.
  • Uses keyboard input intelligently. For example, a quick, judicious asterisk creates a list on the fly. A tab can easily kick off a table. Quick calculations are performed inline. This is something OneNote has done well.
  • Allows completely free-form control over what’s already input. I can select and cordon off something and pull it aside somewhere else. Whitespace expands infinitely in any direction to accommodate. Items (be they drawings, text, or whatever else) can be moved around, side by side or similar.
  • As a corollary, input with structure can be restructured. Lists are dynamic outlines that can be rearranged, re-nested, and so on. Tables’ rows and columns can be dragged around. Grippy handles on things abound to accommodate this.
  • Accepts any manner of input and handles it intelligently: audio recordings; drawing with mouse, finger, or stylus; dragged-and-dropped files, which can be inlined as images or rendered as documents if applicable.
    • Bonus points if the app can index all these things (handwriting analysis, image OCR, audio speech recognition).
  • Organizational scheme with at least two tiers above the note level. OneNote had/has notebooks, sections, and pages (notes).
  • Extremely configurable appearance of notes, easily templated. Organizational scheme is easy to configure (names, colors).
  • Preferably professionally designed.
  • Rock-solid brain-dead sync between devices, preferably with encryption on the client side.

I can sum up the above by saying that I want a large, free-form space that accepts anything from anywhere and tries to do something smart with it. I realize this is a tall order. I’m surprised to hear that this doesn’t exist, though, not even for an exorbitant price (which I’d pay for something which came close). If someone comes across something like this, let me know.

The Discomfort of Being New

It’s sort of incredible after all this time that I get so uncomfortable with not knowing all the answers. Setting aside my personal and spiritual development (“What is the stars, what is the stars?”), I also stumble over this issue professionally.

I finished my fifteenth week at Simple last Friday, and during my short tenure, we experienced one of the most trying times in the company’s history. It has been a difficult time to ramp up, and I’m still pretty new to having a programming job at all. The first of August marks three years since I started at my first one. I learned a lot at my last place, but I’m probably still a little too green to hit the ground running the way I’d like.

Now in my second programming job, I’ve identified a pattern that may have more to do with me than with the jobs I find myself in. I get frustrated very quickly when I’m unsure what to do. I haven’t always dealt with it very well. I expect to sail forward without bumps. Instead, I quickly blame setbacks on lack of process, documentation, opaque code, bad tests, unfamiliar culture, and a number of other externalities. The truth has a lot to do with just being new. I can’t speed past it, avoid it, or outsmart it. There’s no other way to become a veteran than by the pain of experience.

It’s like I’m so used to being able to hand in my test first in science class, and now in calculus I’m squirming while watching the others looking breezy. I figure it’s the teacher’s fault, curse the awful textbook, and complain how uncomfortable my chair is.

If I get to a point where I can internalize the discomfort, I start beating myself up with it instead. I finally reached that point a couple of weeks ago. I began questioning myself. I don’t have any good coping tactics for this stress. I’ve found I end up swinging to the other extreme; it’s not everything around me being awful and wrong, it’s just me. I feel like a new firefighter losing control of the hose, watching the fire burn out of control and screaming apologies.

It’s neither of those extremes. It’s just being new, and it’s uncomfortable. I’m in the same boat everyone’s spent time in. Whatever I do, as long as I hold faith with the process, it’ll pass.
