Many software engineers are familiar with the usage of special comment “tags” that can be added to their code comments, for use in searches, automated task tracking, and so forth.  Some of the most popular are FIXME, TODO, UNDONE, and HACK.  But what about others?  I’m sure plenty of people have developed other tags of good general utility that deserve to be shared.

So here are a few additional tags that I’ve adopted over the years.  Some are useful standalone, others are most effectively used when combined with more common tags (like FIXME).

YOUAREHERE

Easily my favorite new tag of the past couple years.  Before you go home for the night (or otherwise stop working, if you’re already home), drop one of these wherever in your code you’re currently working, followed by a one-paragraph braindump of your current thought context.

Example:

/*
    YOUAREHERE The performance here kinda sucks, but I'm not quite sure
    why yet.  The initial profile indicated that accessing the frobnitz
    via the foobar interface is taking 5 ms, but we're barely doing
    anything in there, or at least we're not supposed to be.  Need to
    dig into foobar a bit more next, see what's going on there; check
    with Greg if there's a known issue first.
*/

Just write a quick summary of what you just did, what the current issues are that you’re thinking about, and what your probable next step should be.  Usually around 5-10 lines of comment text should be sufficient.  Save the file (don’t check it in!  We should all know to avoid end-of-day check-ins anyway; YOUAREHERE doesn’t change that), and then once you’ve saved it… walk away.  You can do unrelated stuff like check email or whatever, but don’t mess with any more code for the day once you write your YOUAREHERE comment (if you do then you’ll have to rewrite it later, so by definition the YOUAREHERE comment only works if it’s the last comment of the day).

Doing this helps you disconnect from work, so you can fully engage in other parts of your life (family, hobbies, etc.) and then get a good night’s sleep.  I’ve found that I am much less likely to have my brain endlessly (and usually unproductively) spinning on a work problem in the middle of the night if I’ve left a YOUAREHERE comment, because a lot of that spinning is driven by subconscious worries about forgetting my thought context.  When that context is preserved, those worries go away.  Moreover, by keeping my mind from excessively ruminating on a problem, I allow the right side of my brain to work on the problem more freely in the background (often providing better results spontaneously) while I’m away from my desk.

The next day when you get into work and are ready to start dealing with code again, open up your editor (if it’s not open already), find your YOUAREHERE comment, read it thoroughly, and spend a minute or so letting all the relevant context page its way back into your mental cache.  Afterwards, you’ll feel like you’re “right where you left off”; you can safely delete the YOUAREHERE comment and carry on.  Very handy!

TESTTEST

Doing a code search for “TEST” comes up with way too many hits, but “TESTTEST” is considerably rarer.  Use this for temporary test code that you don’t need or want to turn into a more permanent unit test; you just want to quickly verify that something basically works.  Having a special tag like this reduces the likelihood that you’ll accidentally check this stuff in, since you can spot it quickly during a pre-check-in review.
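
For example, here’s a minimal sketch (the function being tested is invented purely for illustration):

    #include <cassert>

    // Hypothetical function under test, stubbed out for illustration.
    static int Frobnicate(int x) { return x * 2; }

    int main()
    {
        // TESTTEST quick sanity check on Frobnicate; delete before check-in
        assert(Frobnicate(21) == 42);
        return 0;
    }

A quick search for “TESTTEST” during your pre-check-in review will flag both the comment and the scaffolding around it.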

VOODOO

Best used in combination with FIXME or HACK, the VOODOO tag conveys an extra sense of fragility that needs to be respected.  VOODOO code is code that ostensibly works, but for the time being you don’t necessarily know why (even though you wrote it!), and for that reason it’s not to be trusted.  You need to take these code blocks seriously when working in their vicinity, and any attempted direct fixes should be done with extreme care.
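
For example, a hypothetical case (combined with HACK, as suggested):

    /*
        HACK VOODOO Calling Flush() twice here works around the corrupted
        textures we were seeing on certain drivers.  One call should be
        enough, and I don't yet know why the second one matters.  Don't
        touch this without testing on the affected hardware.
    */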

Ideally VOODOO blocks don’t last very long (we should all completely understand the code we write, yes?), but code doesn’t work in isolation; it often calls into other code that may be insufficiently documented or understood (especially problematic if it’s closed source).

I’d be lying if I told you that every one of these VOODOO blocks gets fixed before ship.  Nobody should want to remain ignorant, but once we get into areas like weird graphics driver behavior, these kinds of situations do come up.  Having a dedicated tag like VOODOO indicates that the ability to provide a true fix may be beyond the boundary of available knowledge, and that one should tread carefully as a result.

Used very sparingly, for obvious reasons.  Junior engineers using VOODOO should try to clear the tags out with a more experienced engineer before checking in code containing them.  Senior engineers using them should try to find an appropriate middleware or hardware DevRel contact to talk to.

Note: Never, ever use VOODOO in connection with multithreading related code (i.e. VOODOO that “fixes” a deadlock etc.).  That’s not VOODOO code, it’s a time bomb.

WART

Old code that is being preserved “for reference”.  If you have a usable version control system (you do, don’t you?), you should clear WARTs out before check-in, unless the WART is combined with another tag like TODO or FIXME indicating a potential loss of functionality that you intend to regain soon.  I need to remember to use this one more often. 🙂
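
A sketch of what this might look like (the code itself is invented for illustration):

    // WART FIXME old brute-force lookup, kept for reference until the new
    // hash-based version has proven itself; delete after the next milestone
    //
    // for (int i = 0; i < count; ++i)
    //     if (items[i].id == queryId)
    //         return i;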

Ranked FIXMEs

This is a variant on our old friend the FIXME.  A number of years ago I noticed my friend Scott Bilas would use sequences of dollar signs in his comments ($, $$, $$$, …) to indicate things that needed attention; more dollar signs meant a higher priority.

I liked the principle, but the dollar signs alone provided a little less context than I cared for.  So I started combining it with FIXME, resulting in comments like this (note that the “CDH” is just my initials; I have a personal rule of attributing my FIXMEs):

// CDH FIXME$$$ data needs to be sorted to be used more easily in editor

Again the number of dollar signs is based on subjective priority, but my (very rough) ranking has been something like this:

  • FIXME$ : Don’t really care that much; I doubt I’ll ever be idle enough to fix these, but it’s good to make a note.
  • FIXME$$ : Eh, this actually may need some attention at some point, but it’s probably not too pressing.
  • FIXME$$$ : This probably needs to get fixed before ship.
  • FIXME$$$$ : Should be fixed before the next milestone.
  • FIXME$$$$$ : You should fix this before check-in.
  • FIXME$$$$$$ : Fix this today.

I’d be lying if I said I held to this strictly (there are some FIXME$$$$$ I’ve written that have been checked in and probably still survive months or years later), but I attribute that more to my own imperfect-but-slowly-improving ability to recognize an appropriate priority for these issues than to an inherent deficiency in the ranking concept itself.

In any case, it’s nice to be able to search for “FIXME$$$” and get only FIXMEs ranked three-dollar-signs or higher.  By the way, I use dollar signs because I work primarily in C++, where $ is infrequently used.  For other languages that do use it (like Perl etc.), you’re better off choosing another symbol with equivalently rare use in the code itself.
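
As a concrete search example, grep’s fixed-string mode avoids treating the dollar signs as regex anchors:

    grep -rnF 'FIXME$$$' src/

Since it’s a plain substring match, this also picks up FIXME$$$$ and above.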

Enough For Today

Got some good tags of your own?  Share them! 🙂

I’ve talked about this idea informally amongst colleagues for a couple years now, and this weekend I thought it might be a good idea to open it up to a wider audience.  It’s a principle of planning and consensus-building that I’ve held on to for a while; hopefully you find it somewhat useful.

To set the stage, I’d like to start with a little story.

Two guys from the East Coast of the U.S. were driving westward on a cross-country trip.  On their first day of driving, they reached a large city in Ohio before realizing that they were lost.  It had been a long first day, so they decided to get a hotel in Columbus for the evening before continuing their journey in the morning.

Now they just had to figure out how to get to Columbus.  The driver insisted on taking a northbound road, while the passenger insisted on a southbound road.  Back and forth they bickered, on and on for minutes that seemed like hours.  “North!”  “South!”  “North!”  “South!”  Combined with how tired they already were, their arguing grew so heated that it almost seemed it would come to blows.

And in the midst of their arguing, a realization simultaneously dawned on both of them… did they know what city they were currently in?  They had gotten lost, after all; did they actually know, for certain, where they were?  It turned out that no, they didn’t, and moreover the driver had been assuming they were in Cincinnati, while the passenger had been assuming they were in Cleveland.

The moral of the story: If you want to agree on where to go, you must first agree where you are.

The acronym STITA stands for Status, Then Intention, Then Action.  The order is important.

Status

The role of Status as a serious point of discussion is overlooked in many planning situations, because it’s often easily assumed, and not exactly “forward-thinking” either.  Talking about Status is not necessarily very “fun”.  But you must reach consensus on Status before you proceed, or people are going to have differing views of reality that they use to inform their subsequent plans.

Status is about Now.  To reach consensus on Status, you must sit down with stakeholders and agree not to talk, at all, about what to do next.  The purpose of a Status meeting is to reach a shared conception of brutally honest reality, where everybody involved knows both the joys and the burdens surrounding your current state of affairs.

Are you behind schedule?  Admit it.  Do you have hard financial concerns?  Talk about them.  Are there relevant problems of your own that you need to admit to?  Own up to them.  Are the core values of your business/group suddenly called into question?  Reaffirm them.  Get everything on the table.  This isn’t about optimism vs. pessimism; those perspectives only differ in how people respond to a common frame of reference.  That common frame must be reached first, and it is a critical part of planning.  Reach consensus on Status.

Intention

The next stage of planning is to reach consensus on Intention.  This is called out because it is often easily conflated with Action, and they are not at all the same thing.

Reaching consensus on Status results in a common assessment of your current “position”.  From there, your Intention is about your desired “direction” away from that position.  You must reach consensus on Intention before any truly meaningful action can be taken.

What are your actual goals?  What are you really trying to accomplish?  This isn’t about the “how”; you’re not concerned with the means yet, only the motive.  Get everybody involved to agree on your common intentions, where you see yourself being once your resulting plan, whatever it turns out to be, has run its course.  Do not be concerned about strategy or tactics here, just your goals.  Reach consensus on Intention.

Action

Only once your Status and Intention are agreed upon is it time to start talking about Action.  You know where you are, and you know where you’re going; now, how should you get there?  This is what planning meetings are often supposedly about (think about the phrase “action item”), but the conflation with Status and Intention often makes the process unnecessarily ugly.

A shared Status and Intention go a long way towards creating a true common vision among stakeholders, and that common vision can elucidate the right Action to take much more easily than when all of these stages are muddled together.  When everyone has the same vision, they can become truly invested in their shared goal, and collective buy-in becomes possible.

From that place, reaching a consensus plan of Action becomes not only easier to achieve, but the resulting plan itself becomes much more realistic, because all the underlying conflict due to competing goals has already been laid bare.  Reach consensus on Action.

STITA : Status, Then Intention, Then Action.  In that order.

 

(This is a repost of my other AltDevBlogADay posting on 2011.03.27, entitled “Rhetoric for Engineers”; I’m changing the title slightly here since the original name matched a book title, which made searches tougher.  Original location : http://www.altdevblogaday.com/2011/03/27/rhetoric-for-engineers/ )

Today I’m going to talk about rhetoric, a.k.a. the “art of persuasion”. If you’re an engineer/programmer, this is probably something you don’t normally think about; in fact the very idea of the subject may leave a nasty taste in your mouth. If so, then you are exactly the kind of engineer I’m talking to.

Whether you like it or not, the art of persuasion affects you. It affects interaction with your peers, it affects the lasting impression your work has on others, it affects your career growth (especially if you’re thinking about leadership or management at all in your future), all kinds of stuff. So it’s worth talking about.

Now I’m not going to sit here and go into long tutorials about the Western philosophy of rhetoric, diving into logos, ethos, and pathos and all of that. From an engineering point of view, all of those issues are “implementation details”; you can read about them at your leisure. Instead, my goal here is to get you interested in the subject in the first place, and that starts with learning to respect the subject.

Respect for rhetoric doesn’t come lightly to many engineers. It’s usually seen as the domain of salespeople, marketing, and lawyers, all groups that engineers usually barely tolerate and sometimes outright despise. There is a philosophical chasm that needs to be crossed in order for these two sides to see eye to eye, but it can be crossed.

To set the stage, I’m going to bring up a little book by Neal Stephenson called “Anathem”.

[Image: Anathem by Neal Stephenson]

This isn’t actually a “little” book; it’s almost a thousand pages long. Some of you may have read it, but if not, don’t worry, I’m not really going to get into spoiler territory here. Rather, I just want to bring up one small side-plot that goes on in the book between two schools of thought. These two schools have an ongoing feud between them, rooted in their beliefs about the nature of symbols (I’ll use my own names for these schools just to reduce any spoiler potential).

The Semanticists believe that symbols have inherent meaning, independent of any particular conscious observer.  This meaning transcends the particulars of the specific syntax used to denote the underlying symbol, which is itself entirely objective and universal in nature.  Followers of this school of thought go on to become scientists, mathematicians, engineers, and so forth.

On the other hand, the Syntacticists believe that symbols have no inherent meaning beyond that which is projected upon them by conscious observers. To them, many abstract interactions are ultimately nothing more than games of syntax, and the syntax itself is the only thing that’s known to be real. Those who come from this school become litigators, communicators, and politicians.

These two schools have Western philosophical analogues in various spheres of ontological debate, such as Realism vs. Nominalism, Plato and his issues with the Sophists, that kind of thing. Again, you can get into the philosophical backstory at your own leisure; I’m certainly not an expert on the subject, but thankfully it’s not necessary here since their counterparts in Anathem are sufficient for our discussion.

Anyway, it turns out that the masters of each of these two schools of thought realized that each side had dominion over half of the timeline.  The Semanticists were able to manipulate the future, while the Syntacticists were able to manipulate the past.

I’m sure plenty of you are familiar with the “light cone” concept as explained by cosmologists like Stephen Hawking:

[Image: Light Cones]

In this model, the Present is a point, the Past Light Cone is all possible timelines that could result in the Present, and the Future Light Cone is all possible timelines that could come out of the Present.  To the degree that quantum uncertainty allows, the timeline is malleable within these two cones, and the Semanticists and Syntacticists in Anathem essentially figured out how to do so.

The catch, as it turns out, is that even though both sides have dominion over one cone, they still have to work together if they want to avoid discontinuities at the present.

So enough of Anathem, let’s bring all this back to Earth. What does this have to do with us? Well, think about what scientists and engineers do, and what politicians do, and you’ll see how this idea isn’t really too far off the mark. A single mathematical or scientific discovery or invention can have a radical impact on the future of everyone on the entire planet. And the right words spoken in the right places can have a radical impact on what we know our history to be.

In other words, both sides have power, and that power is a delicate balance against the fulcrum of the present moment. If we, as engineers, act like rhetoric is somehow beneath us, it reveals a lack of understanding of this critical balance in the temporal order of things.

To put it another way, we can look at the masters of these two sides at an archetypical level.

Ascetic and Speaker

The archetypical Semanticist is the Ascetic. To them, symbols have meaning and point towards ultimate truth, and as such the ascetic prefers to spend their time focusing on that truth rather than messing with syntax. The syntax is only a necessary stepping stone, to be utilized minimally and only as much as is necessary to point others to that same essential truth, at which point the syntax again becomes unnecessary. For those fortunate enough to hear such a master speak, every word is nectar and worthy of deeper contemplation. Sometimes it’s in the form of an elegant equation, other times a Zen koan, but always riddled with meaning.

On the other hand, the archetypical Syntacticist is the Evangelist.  Sometimes in the form of a politician, sometimes an actor, but always using their words and their presence to influence the way you see the world.  To them, the truth is entirely defined by how you perceive it, and thus if they can alter your perceptions, they can alter the truth.  Their words are often shallow, but persuasive, and if they can get you to latch onto their position even slightly, then self-reinforcing behavior will take over and they’ll have you on their side, improving their position even further.

In the cosmos of human interaction, both of these archetypes are like black holes:

[Image: Black Hole]

The Ascetic is an attractor of limited range but incredible density. Not many people fall into their sphere of influence, but those who do can immediately and palpably sense the depth of their knowledge, wisdom, and intuitive understanding of the universe. Once someone has arrived at this place of understanding, they generally don’t want to leave.

The Evangelist is an attractor of less depth but with massive reach. Because their goals are often selfish, the focus of the attractor tends to be on the Evangelist personally, rather than on any transcendent truth being pointed to. To counter this reduced depth, the Evangelist is incredibly mobile, modifying the position of the attractor at will and bringing all within the wide sphere of influence along for the ride.

Now again, like it or not, these two have to work together, to avoid discontinuities at the moment of the Present.  When someone manipulates the future by inventing something that is profoundly impactful but unexpected by most of the populace, it is the obligation of the rhetorical disciplines to manipulate the populace’s impressions of the time before the invention, in such a way that the invention itself seems not merely unsurprising but completely obvious and inevitable.

This is going on all the time.

To free yourself from manipulation and allow yourself more control over your own Present moment, it is important to try and gain mastery over both the past and the future, over both syntax and semantics. Let’s consider again the case of the black holes mentioned above.

If you can understand both the Ascetic and the Evangelist, and can wield both of their disciplines together, you effectively become an attractor with great depth and wide reach.  You become a person capable of bringing more people to ideas of significance, which is empowering for them and liberating for you, because these are ideas that you already have; they’re just begging to get out.

In any communication, the content of what you say, the semantics, will influence how people plan their future actions; the way you say it, however, will determine how people remember and feel about the communication in the first place.  Independently each can be useful, but if you combine the two together effectively, with smooth continuity, you get something that can convey genuine inspiration, which only happens when everything is aligned in the Present moment.  Inspiration is a deeply seated thing, and can be a bigger carrier for your ideas than any amount of rational explanation alone.

An engineer capable of persuasion can be a mighty force indeed!

Like any other skill, rhetoric is one that takes a long time to master, and I’ve certainly got a long way to go myself. But even the smallest steps can pay massive dividends, as I found out personally when I joined up with Toastmasters (http://www.toastmasters.org) a couple years ago. I only had time to attend for a few short months before leaving due to scheduling conflicts, but even that little bit of exposure to the world of rhetorical thinking (focusing not on what you say, but how you say it) had an enormous impact on my day-to-day life.

Rhetoric is not inherently evil; it’s simply another kind of power, one which many engineers could greatly benefit from.  Use it, and use it for good! 🙂

(This is a repost of my 2011.03.12 AltDevBlogADay posting of the same name; I only wrote two posts for ADBAD and figured I would bring them over here to help frontload the blog, ’cause hey, why not.  Original location : http://www.altdevblogaday.com/2011/03/12/perfect-data-pipelines/ )

Greetings everyone!  Time for my first post on this wonderful shared blogging experiment we’ve got going on here at #AltDevBlogADay.  Now, what to talk about?  Decisions, decisions, decisions… how about a few words on a subject close to my heart these days: Data Pipelines!

Of course, a topic like this is huge and not something that can be summed up in a single post (nor will I try to do so), but since I’ll likely be talking about this kind of stuff a lot, there needs to be some place to start.  With that in mind, I’d like to talk about one possible vision of a “perfect” pipeline, since if we have an idea of what perfection could be, then we can more easily identify where our current pipelines don’t match that ideal (and cause us all kinds of annoying problems as a result).

Are you seriously suggesting you know what “perfection” is?

Certainly not; I’m as human as anyone else.  But I’ve noticed that whenever I find myself stuck on something (a particular problem, or really just life in general), it’s usually been really helpful to direct my attention towards where I’m headed, rather than dwelling on where I am right now.  Of course, before you can direct your attention somewhere, you have to know where that somewhere is.  A simple enough concept, but one most of us don’t utilize as often as we could, perhaps because identifying where we really want to go can often be very difficult.

Since I’ve spent a lot of time working on asset processing pipelines in my career (and made plenty of mistakes along the way), at some point I managed to distill down what I’d really like the pipelines I create or work with to look like.  Some get closer than others; the practicalities of the business admittedly make some of these efforts prohibitive (and there are diminishing returns for sure)… but it’s still nice to know where improvements could be made if desired.

I’ve had a mental sketch of this kind of idealistic pipeline architecture in my head for a while, but previously it had only ever left my head in the form of various scribblings on whiteboards or post-it notes.  So I decided to go ahead and flesh it out a little, add some excessive captions, and lots of pretty rainbow colors:

[Image: Data Pipelines diagram]

The central idea surrounds the concept of what I call “tiers” of the pipeline.  This name falls in the time-honored category of “vaguely-defined metaphors which nobody can agree upon”.  Previously I’ve used words like “stage” or “phase” for this kind of thing, but the problem with those terms is that they have strong one-directional overtones to them (and not surprisingly many of my pipelines have been horribly one-directional in the past).  So for this, I really wanted a term without that kind of mental baggage.  I’ve always been fond of the “ring” metaphor for processor privilege levels (“Ring 0” for kernel code, “Ring 3” for user mode code, that kind of thing), and there seemed to be a weird kind of correlation to this situation, so I wanted a word with a similar vibe.  Hence, “tier”.

In the image above, there are eight of these tiers, each representing a basic role that data in the pipeline can fall into:

  • Tier 7 covers anything conceptual and above the automated pipeline.
  • Tiers 5 and 6 are authoring entry points into the pipeline, low- and high-level respectively.
  • Tiers 3 and 4 are intermediate representations, one runtime-directed and one author-directed.
  • Tiers 1 and 2 are runtime-compatible, the former targeted for efficiency and the latter for diagnostics.
  • Tier 0 is solely targeted towards the hardware and makes no allotment for the existence of human beings whatsoever.

On the right side I mention the audience; the higher tiers are focused on productivity, while the lower tiers are focused on performance.  Both of these extremes are worthy of respect, and worthy of optimization.

On the left side I’ve used a couple more metaphors worth talking a bit about.  Any data pipeline has a process of moving data from the higher tiers downwards; depending on the scenario it’s variously called things like “compiling”, “building”, “cooking”, and so on.  In order to stay one step removed from these kinds of names (again trying to avoid some associated mental baggage), I’ve adopted the term “freezing” for this process.  There is also a corresponding process called “melting” which goes in the opposite direction.
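
As a very rough sketch of how that pairing might look in code (a hypothetical interface, not from any real codebase):

    #include <vector>

    // Hypothetical: an opaque blob of data living at some tier.
    struct Blob { std::vector<unsigned char> bytes; };

    // One stage of the tool chain, converting between two adjacent tiers.
    struct IPipelineStage
    {
        virtual ~IPipelineStage() {}
        // Downward conversion, e.g. Tier 5 -> Tier 3 ("compiling", "cooking").
        virtual bool Freeze(const Blob& higherTier, Blob* lowerTier) = 0;
        // Upward conversion back toward authorable data, where supported.
        virtual bool Melt(const Blob& lowerTier, Blob* higherTier) = 0;
    };

The point of the symmetric pair is that melting is a first-class operation rather than an afterthought bolted onto a one-way build process.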

How does this help me?

Just as it is the tool chain’s job to move data between tiers, it is the job of the authoring tools, diagnostic tools, and engine subsystems to work with the data as it exists on individual tiers.  If you’re writing an engine subsystem, for example, then it is helpful to keep in mind which tier it primarily works with, so you can write the code accordingly and not make later optimization efforts difficult.

For example, a highly object-oriented subsystem may make a lot of sense to human beings; however, it may have some nontrivial performance tradeoffs.  If you are writing engine code that services data at Tier 0, then the audience for that code and data is the hardware, not human beings.  At this tier, when you have a choice like using an Array of Structures (AoS) vs. a Structure of Arrays (SoA), you would probably want to choose the latter because it’s better for the hardware, even though the former may be more intuitive to a human.  This is the kind of mindset you’d want to be in whenever you’re operating at Tier 0, and in fact even the basic act of saying to yourself “This is Tier 0 code, I know how it has to be written” before you write anything there can go a long way toward preventing a variety of low-level architectural mistakes that can be hard to repair further down the line.
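
To make the AoS vs. SoA choice concrete, here’s a generic C++ sketch (not tied to any particular engine or data set):

    #include <vector>

    // Array of Structures (AoS): intuitive to humans; each element is one object.
    struct ParticleAoS { float x, y, z, lifetime; };
    std::vector<ParticleAoS> particlesAoS;

    // Structure of Arrays (SoA): friendlier to the hardware; each field is a
    // dense, contiguous array, so e.g. decrementing every lifetime touches one
    // unbroken stream of memory (better cache behavior, easier to vectorize).
    struct ParticlesSoA
    {
        std::vector<float> x, y, z;
        std::vector<float> lifetime;
    };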

In contrast, code working with data at higher tiers is all about workflow, and you wouldn’t want rigid hardware-centric concerns making a confusing mess of things at those tiers.  It’s all about making sure your code and data for a given tier work together for the right audience, and keeping the lines between tiers well-enforced so that you don’t have a confusion of concerns.

Turtles All The Way Down

One important concept this picture does not really convey (except for a brief mention in the Freezing text) is the idea that a pipeline as a whole can serve as a platform for supporting other pipelines.

For example, your tool chain that converts data between pipeline tiers is not actually part of any one tier itself, but rather makes up the platform that the entire pipeline is based on.  Thus, just because a tool may be converting data between Tiers 5 and 3 doesn’t mean it should be written inefficiently; within the platform of the tool itself, that conversion code is running at its own Tier 0.

If that sounds confusing, think about the example of code itself.  The code you write may help to define a platform for your game/tools/etc, but if you go down a level of abstraction, that code is itself just data (text) fed into another pipeline at a high tier (Tier 6 for some languages, Tier 5 for others).  That pipeline is being supported by a platform of a tool chain called a “compiler” and all its various stages, eventually freezing down to a bunch of Tier 0 data we call machine code (plus ancillary tools such as debuggers which support a rudimentary “melting” process for diagnostic situations).  It’s essentially the same thing, just a different platform.  And then of course that compiler itself was originally written as a bunch of code… and I’m not even gonna start talking about the platform of the hardware itself… hopefully you see what I’m getting at.

I mention all of this because just as one must be aware of what tier they’re working with at any given time, one must also think about which level of platform abstraction (i.e. which data pipeline) that tier resides in.  Mixing tiers can be messy sometimes, but mixing platforms can be even messier.  Gotta keep those mental models straight. 🙂

Stop!  Freeze!  Enhance!

Imagine, if you will, the following scenario:

You’re playing your game, all of a sudden you notice an asset problem.  You’re effectively working at Tier 0, so if you want to change something you’re going to have to go to a higher tier.  You pause the game, identify that asset, melt it back to a tier that you can do something with (either a direct melt if possible, or if not, then via some automatic mechanism that fetches the original higher-tier data).  You modify it, freeze it back down to Tier 0, unpause, and go on your way.

Many of us have spent a lot of time and effort getting our games and toolchains to effectively act like the above scenario.  We strive for an ideal balance between efficient workflow and efficient execution.

Now let’s take it one step further.  Another scenario:

You’re playing your game, all of a sudden you notice a bug that’s probably caused by a certain subsystem.  Once again, you’re at Tier 0, so you have to melt a bit.  You find that block of code, melt it back to Tier 2, so you can debug it better; everything else is still running at Tier 0.  You unpause and run again for a bit, stepping through until you identify the source of the problem.  Then you pause again, melt that code back to Tier 6, change it, freeze it back down to Tier 3, and step through again.  Looks like it’s working now, so freeze all the way back to Tier 0, and continue on with your game.

We’re also collectively moving towards this scenario as well.  Tools such as scripting languages, compilers with good Edit & Continue support etc. are aiding this process, but we’ve still got a long way to go.  There are researchers out there who are working towards this kind of ideal, but diving into their efforts would probably take an entire post of its own to talk about, so… maybe next time. 🙂

For now, my primary intent is just to get people’s mental gears turning about how their pipelines work, and spark some curiosity about what could be done to make them even better.

Okay, Enough Already

That’s probably good enough to start things off with.   I’ll be talking more about pipelines and lots of other things in the future, but for now please feel free to chime in with your thoughts in the comments.  If you’re doing something awesome with your pipelines (code, assets, whatever), or you have an even more awesome vision of a perfect pipeline architecture, please share!  That’s what this is all about. 🙂

Here’s a fun little concept that a couple of us at the office came up with a few days ago; this is one that engineers (and maybe producers / PMs) would probably like. 🙂

You may already be familiar with the MIT lab idiom “yak shaving”; if not, please check that out first, as it’s a gem of its own:

http://sethgodin.typepad.com/seths_blog/2005/03/dont_shave_t…
http://projects.csail.mit.edu/gsb/old-archive/gsb-archive/gs…

With that in mind, a couple of us at work decided that this can be formalized a bit more, in the spirit of Big-O notation for algorithmic complexity.  Introducing:

Big-Y notation : A measure of the likely amount of Yak Shaving (YS) that would occur for a given task, relative to how cumbersome the task is to do.

Some examples:

Y(1) : Constant YS complexity – Doesn’t matter how cumbersome it is to do this task; the YS is the same regardless.  Usually implies an automatable task that you absolutely have to automate to do it even once (because of the intractability of doing it by hand etc.), but once you’ve automated it, you can do it as many times as needed without any real effort.

Y(log n) : Logarithmic YS complexity – The more cumbersome this is, the more the YS scales up, but mostly in the beginning; it thankfully starts to taper off pretty quick.  Once a certain level of cumbersome is reached, the impact on the YS becomes mostly insignificant.  e.g. “If I have to do this thing only once, I’ll just do it by hand.  If I have to do it 3 or 4 times, I’ll work on a small script to speed up a couple parts of the process.  If I have to do it 20 times, the script won’t be enough, but I can write a tool to help a bit more, even if it’ll still require a bit of handholding.  If I have to do it 1000 times, I’ll work on making that handholding go away, and then we can do it as many times as we need”.

Y(n) : Linear YS complexity – The YS scales up in direct proportion to how cumbersome the task is.  We can call this “the line of distraction”; any complexity less than Linear will eventually level out to a general solution, and any complexity greater will just make things worse as you progress.  Linear YS complexity itself is thus a bit more rare, but can be identified by situations such as “wow, this thing is kinda annoying… what if I just added Unrelated Feature X?!  I bet that’ll make this easier to deal with, yeah that’s it… I’m certain it will…”

Y(n^2) : Quadratic YS complexity – The problem is cumbersome, but the resulting YS involved makes your solution ultimately more cumbersome than the original issue.  Problems are given ineffective solutions, thus causing more problems.  Occurs often with engineers who haven’t yet gotten out of the “architecture astronaut” phase of professional development.

Y(c^n) : Exponential YS complexity – Like quadratic complexity, but worse: the “solutions” that people come up with as a result of YS are now so cumbersome to deal with that they have YS of their own, so the problem has become compounded.

Y(n!) : Factorial YS complexity – Reserved for unfocused, open-ended side projects at home.  You know what I’m talking about.

Anyway, this seemed like something the game development crowd might like.  “Don’t try to jam multiplayer support into that old framework, it’s at least quadratic yak shaving complexity and you know it!” 🙂  Less jokingly, if you can reasonably state that, as a combination of task difficulty, lack of clear goal definition, and human nature, a task will be Y(n) or greater, you should probably work on scoping the problem better and doing a bit more planning with constrained outcomes, in order to rein it back in as much as possible.

Big-Y notation! 😀
A clarification on the use of the word “cumbersome” : This isn’t necessarily about the task being repetitive; it’s more about the likelihood that you’ll lose focus while doing it, because it’s no longer interesting.  Repetition is an easy example because it can cause loss of interest quickly, but a non-repetitive, ill-defined problem can do so as well; both cause loss of focus.  The reason for Big-Y is that certain kinds of focus loss have a low impact, while others have a very high impact; the notation helps to narrow down the combined risk of drifting intentions once the technical and human factors are all accounted for.  We originally came up with it in humor, and it is admittedly fairly loosely defined, but there’s a grain of something genuinely interesting in it, which is why I wrote it up.

And now for a related Bonus term:

Ygenvector (pronounced “weigenvector”) : A task which, when performed, and accounting for both the negatives of time lost and opportunity cost, plus the positives of productivity gained and product manifested, results in a net operational benefit of Zero.

Again it seems silly on the surface, but in reality Ygenvectors (and other tasks approaching them) are incredibly common, and incredibly dangerous.

When a task is completely unrelated to the goal, it’s not a Ygenvector, because the time loss puts it squarely negative.  In other words, Ygenvectors actually are related to the ultimate goal you want to accomplish; the problem is that their gains exactly offset their costs, meaning you’ve ultimately done nothing.  The fact that this doing-nothing seems “related” makes it hard to detect (or admit to oneself), which makes Ygenvectors very deceptive.  You genuinely think that you’re doing useful work, but you’re actually not.

Just another one to think about.

Just getting this obligatory thing out of the way.  Insert introductory autobiographical text here, blah blah blah “hey I’m setting up a blog now” blah blah blah you’re not here to read this stuff anyway, so let’s get to some more interesting content, shall we?

TL;DR: Hi.