Engineering Game Development

Lee Winder, Technical Manager at Blitz Games Studios on Software Engineering and Game Development

I originally posted this to #AltDevBlogADay on Sunday November 27, 2011.

COMICS + INFORMATION DESIGNSuccessful communication is one of the most important aspects of a well functioning and productive team. Without good communication between peers, managers, publishers and anyone else involved in the game development process a team will not perform at it’s best.

If developers do not feel confident in the reasons behind their work, if they don’t fully understand how their work will fit into the project as a whole or indeed when it will be required, the team will produce lower quality work with last minute changes and requirements fostering an atmosphere of distrust and crunch.

But communication is one of the most difficult things to get right but so costly when it’s done wrong.

The following are methods I’ve used over the years to try and improve communication within the team. I’d be very interested to hear other ways people have tried too!

 
Team Wiki
Having an open Wiki that people can contribute information and notes is a good idea for documentation and persistent information. It is not a good tool for perishable short term information. Documentation on team processes (getting the latest build, creating submissions, setting up PC’s etc.) is usually the kind of information that finds a home on a wiki and while it’s useful it’s not something that affects the team on a daily basis.

And as it requires people to physically search for the information in the long term people don’t bother looking for new information on a regular basis.

As a result, the Wiki is useful but doesn’t actually improve the day to day communication on a team.

 
Blogs
We have a team blog that people update about 2 or 3 times a week, usually discussing their recent work, posting up screen shots or letting people know the state of the project. It’s a nice simple way to push information to the team, though it does require everyone to contribute to the blog to get good cross team communication going.

Discussions can take on a life of their own which is actually a good way to gauge buy in into a project but can’t be used to judge the success of the process.

But you’d be amazed how many people don’t have any kind of RSS feed reader set up…

 
Micro-blogging
Internal mico-blogging tools like Yammer or Status.net allow people to quickly thrown up what they’re working on, problems encountered or general team information. The best thing about micro-blogging is it’s push communication style with peoples updates being automatically feed to clients which people can update as much as they want (I usually recommended at least twice a day).

But so far I’ve had very little success with micro-blogging tools in a team environment.

Not because the idea was bad (when it worked it worked well) but I’ve yet to find a service where the official client is anywhere near usable and able to filter out the information people don’t want to read. Without a good way to filter and push information where you want it (like all the best third party Twitter tools do) it either becomes an information overload or a sea of noise, neither of which improve communication.

 
Wall Displays
Walls are valuable real estate especially in an Open Plan office and I’ve rarely seen them used to their full potential. But highly visible walls in the middle of a team area are one of the best ways to communicate information across a team.

As an example on my current project we have the entire timeline of the project (it’s only a short one) with dates/deliverables clearly indicated, a ‘where we are right now’ marker and a description – feature by feature – of what is required for a given milestone.

Next to that we have our sprint wall which is our most ‘live’ wall display. But position is key and in our case the sprint wall’s impact on the team is reduced due to it’s rather awkward position between a big TV, a constantly spinning fan and quite a lot of server machines. But I did say wall space is valuable real estate and it’s always hard to find a compromise between distance from team and accessibility.

 
Morning Meetings
Morning meetings are one of the best ways to push information across a team but I’ve found that you need to follow a few rules to make them really valuable.

  • Keep the groups small. I’ve lost count of the number of 20+ people stand up meetings I’ve seen where the majority of the ‘participants’ are looking bored or simply waiting for their turn. If you’re groups are not small, the information is less targeted and much less relevant, meaning more information is lost than actually passed around the team.
  • Keep them focused. There is nothing worse than 1 out of the 6 people speaking for 15 minutes about the most intimate implementation details, leaving everyone else itching to get back to their seats to carry on working.
  • Don’t make them reporting sessions. If everyone is talking at a single person (usually their manager) take the manager out for a while and get people used to talking to each other as it makes it much more likely for people to take in what is being said.

 
Milestone Reviews
I love the concept of a milestone review. Everyone playing the game, lively discussions about what went right, what went wrong and what we can do better next time. But it’s easy for these to be less than stellar if not approached in the right way.

If these reviews are not focused, maybe even as structured as a schedule or points to cover, people may start to feel unsure as to what they can comment on or what exactly they should be doing. You’ve also got to make sure that people feel comfortable both giving and taking criticism and manage the situations when that goes pear shape (and sometimes it will).

I’ve found that when done right, when people contribute to discussions and when people can (importantly) see change as a result of these reviews, the quality of information coming out of them is invaluable. It also has the added bonus of making people feel like they are making a difference to the team and allowing their thoughts to be heard.

 
Sprint Planning
The days of managers sitting in a room building up a schedule and dishing it out to the developers is (almost) long gone. And there’s good reason for it.

Getting a group of developers (again with the morning meetings it needs to be small and focused) to discuss, schedule and plan the work ahead significantly improves the knowledge people have of what is happening across the team. Again, if people feel that have a say in how work will be implemented, how it will be assigned and when it’ll be done by is vital to spreading information about the project and the work being done.

 
So those are a couple of methods I’ve used to try and improve communication and information across the team. I’m sure there are plenty more (and I’ve tried a few which have been colossal failures) so what methods have you used and how well did they work out?

 

Title image by deathtogutenberg.  Used under license.

 

I originally posted this to #AltDevBlogADay on Saturday November 12, 2011.

That's Enough

Refactor: Refactoring is a disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior.

 

I pretty much wish that word had never been invented. The above definition (taken from Martin Fowler’s Refactoring Home Page) seems to have lost nearly all meaning when used in day to day programming conversations.

To further describe the original definition of refactoring:

Its heart is a series of small behavior preserving transformations. Each transformation does little, but a sequence of transformations can produce a significant restructuring.

 

But when someone comes to me and says “I’ll just refactor that…” I can no longer assume to know what they’ll be doing. Based on how people now use this term it could be any one of a number of things including (in order of violence)

  • Making small changes to make the system more understandable, simpler and easier to use without affecting it’s perceived behaviour
  • Optimising elements of the system which should hopefully(?) not effect the external behaviour
  • Making wide ranging changes to a system which will effect elements of its behaviour
  • Deleting the whole thing and starting again without even looking at what we have now

It seems like there is a reluctance to use the words ‘re-write’, ‘modify’, ‘break’ or (in a some cases) ‘trash’ when discussing intentions towards an existing system. As a result what was a very descriptive and clear term has lost all meaning and resulted in something what can no longer be used to clearly describe a useful and often necessary development process.

Do I wish the word hadn’t been invented? Maybe I just wish it hasn’t taken on an image that ‘refactoring’ is somehow cooler or more elite than simply saying what you mean when describing what you’ll be doing.
 

I originally posted this to #AltDevBlogADay on Friday October 28, 2011.

SpookyI recently came across an interesting post over on Agile Game Development titled ‘Scrum Prohibits all Specializations’. The part that stuck me was the following:

I understand that Scrum has been applied mainly to software products and that the elimination of “specialties” means that the database programmer, UI programmer and QA engineer should all be able to perform each other’s roles equally. This is valid.

 

Now I’m concerning myself with only the technical side of an agile team but I’ve seen this raised in a number of different agile circles. In those cases there seems to be the impression that swapping a database, physics or audio developer with any other specialisation like UI, animation or graphics and an agile team should be able to roll up their sleeves and perform the different roles to the same level with the same level of outcome.

To me, this is emphasised in how the product backlog is often used, which is a priority and risk ordered document that doesn’t take into account the skill set of the team that’ll be working on the final product.

Processes such as pair programming, constant re-factoring and code reviews (to name but a few) seem to be seen as ways to not only communicate intent and project information but also skillset and ability across an entire discipline.

 

So What Do Specialists Bring?

But we have specialist developers for a reason. They are great at what they do, they understand the area in which they work and they know how to get the best results in the shortest amount of time. They have a passion for the area they are focusing in which usually means they’ll go a step further to research their area and keep up with developments which other developers may not have the time or the understanding to do.

By spreading your talent thin and assuming that people can fill each others shoes leads to the following issues

  • You are not respecting the knowledge, skill, experience and passion that a specialist can bring to their work and as a result not respecting the developer themselves
  • You’re reducing the impact these people can have on a team and it’s often the experienced specialists that inspire younger members of the team into an area they are interested in
  • The ability of those specialists to learn more about their area and pass that onto others is drastically reduced.
  • The ability for the team to push their development boundaries will be indirectly reduced as everyone on the team aims for the ‘generalist’ role to fit in

 

What About Pair Programming?

Now I’m a massive fan of the various agile techniques out there. Pair programming is an excellent mentoring, development and training tool but it won’t allow one developer to fit into the shoes of another. True, they will have a better understanding of the tools, pipeline and systems being developed which will allow them to fill in, but it won’t transfer the amount of background experience the specialist has.

The same goes for code reviews, constant refactoring and feature discussions. It spreads the knowledge which reduces the risk to the project should the specialist not be around when needed, but the core experience and drive that made the specialist who they are simply cannot be replaced by dropping in a new developer.

 

But Everyone Does A Bit Of Something Every Once In A While?

Of course, sometimes people do need to jump into another developers shoes (illness, staff turn-over, hit by a bus etc.) but this is not the same as expecting a people to be able to fulfil each others roles equally. We can take steps to decrease the impact this will have on the team using the processes mentioned above but it will not allow those specialists to be inter-changed as the project continues development.

We need specialists in any development field because it’s these people that can push their respective fields in directions we might not even be able to imagine. By treating them as interchangeable we might be gaining flexibility to schedule our staff, but we’re losing something far more important and vital to a development team and the products they are creating.

As I said to some one (in 140 character or less of course) when pointed out that people have done this, and even the author of the original post has done it (see the comments)

I’m sure he has done it, I’ve done similar, but it doesn’t mean we did both with the skill of an expert of either.

 

I originally posted this to #AltDevBlogADay on Saturday 30th July 2011.

Different... Quite a few years back I started developing a custom STL implementation which has eventually be adopted and used throughout our company. One of the most important things to get right when developing any kind of generic container library is the way memory is allocated. It needs to be lightweight, highly flexible and above all easy to understand so people are willing to experiment it.

But STL Allocators have a bad reputation and it’s for a good reason. They are complex, hard to understand and have some interesting behaviour that seems designed to confuse. As a result, I needed to look at how we were going to provide a custom allocation system that was both easy to use and simple to understand without restricting what people could do with them.

A Bit of History
A while back Bitsquid published a post entitled Custom Memory Allocation in C++. This excellent post covered how the BitSquid engine used an allocator model to perform memory allocation throughout their entire engine.

But the FTL requires a slightly different approach so I won’t be treading over already covered ground here.

FTL allocators are well hidden inside containers, the only control you have is specifying the allocator type which means it’s use, and how and when objects are allocated and created is completely fixed with only the allocator itself being able to effect the outcome. Because of this the allocator behaviour needs to be as customisable as possible without requiring any changes the container itself.

When the FTL was original started, it was quite small scale and only used by a couple of developers, so allocators were not a priority. The flexibility wasn’t needed so in-container malloc and free allowed us to concentrate on container creation but obviously this wasn’t going to last.

The follow just describes the various approaches we took, why we dismissed them and what we eventually ended up with.

The Initial Approach
Allocators should have a minimal over-head. What we didn’t want to do was increase the size of every container dramatically when we rolled out customisable allocators. As a result, we initially used an approach used by a couple of vendors and defined the allocator specification using only static members – removing the need for an internal allocator object.

// Completely made up code showing the use of a static based allocator
template <typename T, typename Alloc>
class vector
{
  void  push_back(void)
  {
    void* new_memory = Alloc::allocate( sizeof(T) );
    T* new_object = new(new_memory) T;
  }
};

I knew this would limit the flexibility of the allocators, but it had minimal over-head (especially using default template parameters) and wouldn’t effect those containers already being used. And besides, programmers are a creative bunch and I wanted to see what people did with this before resigning it to the scrap heap.

But while programmers were able to work around the main limitation of not having allocator state per container, they were having to jump through hoops to get the behaviour they wanted. Which made it less likely that other programmers would feel confident enough writing their own allocators and their ability to really customise their behaviour was too limited.

So we ended up adding allocator state on a per container basis, making it much easier for developers to do what they wanted though it did add at least 4 bytes per container. But I felt that flexibility and simplicity were much more important.

// Completely made up code showing the use of an instanced allocator
template <typename T, typename Alloc>
class vector
{
  Alloc m_allocator;

  void  push_back(void)
  {
    void* new_memory = m_allocator.allocate( sizeof(T) );
    T* new_object = new(new_memory) T;
  }
};

Allocator Specification
The allocator specification is complicated. While I’m sure there are good reasons for some of the decisions I wanted ours it to be much simpler. So removing the type information (since our allocators work on raw memory and nothing else), removing the rebind(!) and exception handling (which we don’t use) we ended up with the following.

typedef ftl::pair<void*, size_t>  allocated_memory;

class Alloc
{
public:
  allocated_memory   allocate(const size_t alloc_size);
  void               deallocate(void* ptr);
};

And for the basic allocator, that’s it, it doesn’t require anything else to work, but it does have one interesting aspect.

typedef ftl::pair<void*, size_t>  allocated_memory;

As we can have all sorts of allocator types, what it allocates might not be exactly what you asked for. If it can’t allocate all the memory you requested, that’s fine, it simply returns allocated_memory(nullptr, 0) but it can also return more than was requested (for example, fixed sized block allocators will do this). This return type simply allows the allocator to return not only the allocated memory, but also how much was allocated, which allows calling objects to take advantage of this additional memory if possible.

Most of the time this isn’t queried and only what was asked for is given, but it offers an additional level of information which might allow better memory usage in some containers.

So a container will most likely end up with something like the following when adding and removing elements.

// Creating a new object in a container
allocated_memory new_memory = m_allocator.allocate( sizeof(T) );
if (new_memory.first)
  T* new_object = new(new_memory.first) T;

// Losing the object now we’re done with it
old_object->~T();
m_allocator.deallocate(old_object);

This is fine and gives us a very simple entry point for allocators. But by forcing the use of placement new and the destructor call in the container itself, we are limiting what an allocator can do. While the allocators are required to return raw memory, that doesn’t mean that it has to internally. Some allocators might pre-create the objects before returning them so creation is front loaded but the forced placement new could mean we’re over-riding an object that has already been created.

Construction Functions
As a result, we want developers to be able to not only over-ride the memory allocation, but also the object creation. 99% of allocators won’t need to do this so we don’t want to add additional requirements to the allocator so instead we can create non-member, non-friend functions, specialised on the allocator, which will do the creation for us.

template <typename TConstruct, typename TAlloc>
TConstruct* construct(TAlloc& allocator);

template <typename TConstruct, typename TAlloc>
TConstruct* construct(TAlloc& allocator, const TConstruct& obj);

template <typename TConstruct, typename TAlloc>
TConstruct* construct(void* allocated_mem);

template <typename TConstruct, typename TAlloc>
TConstruct* construct(void* allocated_mem, const TConstruct& obj);

template <typename TConstruct, typename TAlloc>
void destroy(TConstruct* ptr);

template <typename TConstruct, typename TAlloc>
void destroy(TConstruct* ptr, TAlloc& allocator);

So our point of allocation/creation now becomes something much simpler and much more powerful.

// Creating a new object in a container
T* new_object = alloc::construct<T>(m_alloc);

// Lose our existing object and return it back to the allocator
alloc::destroy(old_object, m_alloc);

The default version of construct performs the allocation and placement new within the construct function, but should the allocator need something more (or less), simply over-loading the function on the allocator type allows complete control over both memory allocation and object creation. The same goes for the destroy function and the automatic call of the objects destructor.

No Base Allocator
One thing I didn’t add to the allocators was a base allocator interface. The allocators are created within the container, with their type defined at the point of the containers declaration. The addition of a base interface (and the associated virtual functions) would have increase the size over-head of the allocator which is something I wanted to avoid and something I thought didn’t add enough to warrant the overhead. I was less worried about the overhead of calling a virtual function as that would be insignificant compared to the overhead of everything else what was going on.

Conclusion
So by introducing an allocator model that is much simpler (only 2 required functions) with more extensible customisation should an allocator should need it, developers have complete control over all the memory allocation and importantly the object creation itself. Due to it’s initial simplicity, developers have no problem creating new allocators that improve both memory and container use in a project and can start to experiment with better and more efficient allocators quickly.

Title image by Squeezy.  Used with permission.

I originally posted this to AltDevBlogADay on Friday 15th July 2011.

Having a Continuous Integration server running is one of the most useful and powerful tools a development team can use. Constantly checking the state of the code, building assets which might otherwise take hours and generating stats on build quality are all really useful things to have running in the background hour after hour and day after day.

But if it’s not done with care, a CI process, while still providing some useful information, will stop being an important part of a development teams tool set.

The main problems usually stem from a single CI step taking to long. For example, it might take hours to build all the game assets or it might take 40 minutes to build a single configuration. You might have additional build steps (like copying files to a network drive) which can take quite a while if you’re dealing with gigabytes of data.

As soon as a CI step takes to long, you lose the main benefit – the fast turn around of information.

For example, when we first start a project, our simple CI process will consist of the following steps

  • Detect modification
  • Build code
  • Build game assets
  • Copy to network drive
  • E-mail developers (who’s mailed depends on success or failure)

This is fine as a new project is tiny and the whole process doesn’t even take 5 minutes. And we need to build the whole thing constantly as we’re adding so much content the artists and designers need to be on the bleeding edge of what the programmers are creating.

But after a month (or probably less) this stops being suitable. A whole build might start taking 20 minutes then 30, an hour and then two hours, and if we leave it as is, the programmers are not getting the benefit of continuous turnaround and the designers spend ages waiting for a new build.

So what can we do about it?

The first thing is to look at what the CI process is doing, and exactly what we want to get out of it.

  • Continuous Build – We need the process to constantly compile the source, all configs, all platforms. This is so we can detect any compile errors quickly without having to build everything manually.
  • ‘Designer’ Builds – Creating an executable the designers, artists, animators etc. can get with the latest code changes. Ideally one they can request as required and one that is built as quickly as possible.
  • Full builds – A complete build of the game including executable and all in-game assets along with anything else needed to run the game.
  • QA Builds – QA could use the full build if needed, but this is an additional step which packages the build as it would be submitted allowing a better QA pass (DVD emulation, submission content etc.).

From my point of view, those are the four main things I want to get out of a CI process that having a single build stepl won’t give us. You might have other requirements and I’d certainly be interested in hearing what those are.

So what can we do to try and improve the initial process and still get what we want out of our CI machine?

Continuous Build
The first step is easy. We want a Continuous Build process with nothing to integrate, nothing to deploy and nothing to copy. This can be much quicker if we alter our repository modification checks to only monitor source code folders and not the entire repository.

For example if our repository contains scripts, configuration files or (shudder) game assets and executables we shouldn’t kick off a build if these changes as the CB results won’t be any different from last time.

We might also want to reduce the configs we’re interested in building (usually a debug and master build, the profiling builds might be skipped for speed reasons and due to them rarely being using). If we have a decent machine we might get the individual platforms (X360, PS3) to build in parallel as there will be no conflict between the temp files they generate – or even stick them on separate machines if we have the capacity.

The process only ever needs to notify on failure as no one is going to be using this build, it’s a sanity check pure and simple.

So already we have a much faster turn around time between check-in and ‘all clear’. In the past I’ve managed to reduce this from 60 minutes to 5.

‘Designer’ Builds
Initially it might be tempting to use the results of the ‘Continuous Build’ as the aim of this step is to provide the designers and artists with a new executable to take advatage of any new features (very) recently added.

This might be the right idea at the start, when the CB process is taking less than 5 minutes, but that doesn’t last, so we need a faster, more iterative, process to make sure our non-programmers are not hanging around waiting for the latest builds.

Most of the time, designers will use a ‘release’ build (‘release’ being a bit of a misnomer – it’s not releasable in any way, but it has just enough debug information to make it useful but still run at a ‘releaseable’ frame rate). So we only need to concentrate on a single configuration which means we can drastically cut down the length of time between when a modification is detected and a new usable build is generated.

As this is the fastest CI step we have and can often have the most people dependent on it, it’s the first one to run when a modification is detected.

In our case we don’t e-mail people when a new designer build is enabled. This is being built many times in an hour, and people will just end up sending the mails directly to the trash (I know I’ve done that on particularly spammy CI set ups). Simply allowing them to check the build status and update when it’s green works well enough.

Full builds
Developers generate a lot of content for games. Even small games can balloon in size depending on the scope and quality of the final product. As a result, we need a full integration build for a number of reasons

  • It would take every member of the team far to long to rebuild all the assets themselves
  • When a build gets out of sync with the assets, developers need a quick way to get everything back on track
  • When testing the build it needs to have been built on an independent build machine

A full build could be brute force (just builds everything every time) or smart (it concurrently builds executables and assets on multiple machines). This really depends on how long a full build would take. Less than an hour and I personally stick with a brute force approach but any longer and a more intelligent build step would be needed.

Full builds always e-mail the entire team, since it’s rare they happen. This allows people to get latest as they become available (usually at the start of the day) without having to check the status of the build.

QA Builds
The QA build is a special build. It doesn’t rebuild any assets or executables and is automatically kicked off when the daily build has finished successfully. This step packages the build up as it would be presented as part of a final submission along with any submission assets that would be required.

But why not just use the full build as the QA build to save time?

Simply put when testing a build it’s vital that we test under the same situation that our final submission will be tested under. Making sure we run under DVD emulation and use the same assets the manufacturer will use is an important part of the process. Having our CI machine generate these builds for us makes sure we’re doing this from the very start.

Requesting Builds
In every case we give all members of the development team the ability to request any build. If the ‘Designer Build’ is to far out of sync, they might need a full build to get them back on their feet. An asset change might alter the executable but not trigger a ‘Designer Build’ so a designer might need to trigger one manually.

In our case using CCTray allows us to do this very easily, so a new build can be requested by anyone (including QA) at any point of the day without any input from a programmer, allowing them to concentrate on making the game rather than just enabling others.

Technically anyone in the company could request and get a new build (very useful for getting demo builds together without getting the team involved) but I’ve not seen that happen yet.

What I Haven’t Covered
One major thing I’ve not covered here: self testing builds. This can range from simple unit tests running after every build to a scripted run through of the game after every full integration. The scope of this will very much depend on the size of the game and the time you have available to add them. Since this is a big topic in it’s own right, I’ve left that for another time.

Conclusion
So by simply reviewing what we actually want to get out of the Continuous Integration process we’re able to streamline it down to be much faster and much more useful. Complete integrations still happen (and happen when needed) but the common information needed by the team is generated quickly and allows teams to get short turn-around from the CI machine throughout the day.

I’d be very interested to know what other uses people are getting from their CI processes and how they are still making sure the speed and quick turnaround is happening all the time.

Title image by highgroove.  Used with permission.

I’m a strong believer in code reviews.  Having developers review and discuss code in development is one of the best ways to ensure good code quality and provides an excellent way to share domain knowledge with team members who may otherwise not get the opportunity.  It also allows experienced developers to share their knowledge and ideas with less experienced developers, an excellent way of distributing development skills across an entire team.

But review adoption doesn’t always stick.  I’ve seen the process taken up and then fizzle out on a number of teams usually for the same set of reasons.  The following are some of the reasons why a code review process can fail to take hold within a team.

Never Following Through
This is the biggest moral killer while trying to establish a code review process on a team.  What will generally happen here is that when code is put up for review, the author simply takes no notice of the comments or suggests and checks in the code without modification or without any follow up changes.  When other developers are taking the time to examine code and make positive suggestions they will quickly realise if they are simply being ignored.

And this kind of behaviour is both demoralising and infectious.

If developers see their comments being ignored they will very soon start to wonder why they even bother.  And if other developers don’t have to take the time to act of good suggestions, why should they.  Pretty soon you have one of the two following situations – people don’t even bother posting because nothing will come from it or you have review after review being raised and all of them being left uncommented and open for everyone to see.

Developers need to know they have the time to respond to feedback.  If they feel under pressure to move onto the next thing then they won’t have the desire (or ability) to actually respond to suggestions.  While not always possible, in the long term quality should take priority over speed, and if suggestions improve the code there should be an environment where reacting to these is important.

Reviews should always be classed as ‘finished’ when the feedback has been acknowledged and by tracking the number of open reviews it allows people to see what feedback still needs to be acted upon easily and allows people who are slipping to be picked up sooner rather than later.

Lack of Buy In
This is one of the hardest hurdles to overcome.  When a team simply doesn’t buy-in to the idea of code reviews, it will be difficult to get anything useful out of a review.  People won’t post reviews (often stubbornly refusing) or when they are the reviews will be superficial at best – finding very little of note and eventually killing the process through sheer apathy.

This can often be an attitude that is based on other problems in the team.  I have rarely met a programmer who didn’t want to talk about code, or who didn’t want to offer you a suggestion on how something could be improved or optimised.  The lack of desire to talk about their work, when it’s across the entire team, could be systematic of wider problems in the team.

But if the attitude of the team is generally positive and code reviews are the only sticking point, then it’s going to be less of problem trying to resolve these when trying to introduce a process into the team.

If there is a general malaise towards a process, strong leadership is one of the best ways to get other members of the team interested in a process.  Notice that I didn’t say strong management.  Leadership is a whole different ball game.

You can also start with those who are interested, keeping reviews small but public (they should rarely be private anyway).  As other members of the team see people in animated discussions on how something was implemented, it’s a unique programmer who doesn’t want to get involved at some level.

Unrealistic Expectations
I’ll go on record and say that you will rarely find bugs in code you review, especially if you are reviewing all or most of the code being written.  Or, at least you shouldn’t because your ‘bugs to code written’ ratio shouldn’t be high enough otherwise you have more serious problems.  But when a team is new to code reviews they can often have an expectation that issues will be found and they will be found often.  Which can cause a team to question what they are getting out of reviewing everything that’s being written.

But I’ll also go on record and say that code reviews can prevent bugs.

Code reviews benefit teams in the long term more than they do in the short term.  You might not be finding bugs on all your reviews but over time, if people are talking about code and sharing their experience, your team on the whole will become better programmers.  They’ll think that little bit longer before committing code if their peers are going to be looking at it, and they’ll have their bad habits smoothed out review after review.

And all this will lead to less bugs being added to the code base – but this is one of the hardest things to grasp when a team starts out and is looking for short term gain.

Peer Fear
This can manifest in a number of ways.

The first is through a fear of criticism where a developer is simply fearful of the criticism they will receive when putting their code up for review.  Unfortunately this isn’t always in the hands of the poster.  If a team has even one aggressive reviewer, this can effect the entire process, with people hesitant or even refusing to post their code, and when one reviewer starts to control the process, the ability to share knowledge decreases drastically.

The other side of the coin is where developers are worried about disagreeing with each other.  Comments on reviews are not always correct and not always the best course of action.  But if the team has developed a routine of automatically acting on suggestions (either from a subset of developers or all of them) then in effect any productive discussion is squashed and if people are afraid to say no, the code base isn’t always going to go in a good direction.

When this happens, the process can quickly lose steam.  If reviews cannot start a worthwhile discussion then the worth of reviews will decrease drastically.  If people see the code base is not improving (or even worse, degrading) they will not continue to post up reviews and the process will simply die out.

This is usually a management issue.  Domineering peers will destroy any process, but this can usually be resolved by taking that peer to one side, discussing the reasons behind the process and explaining how they can still be part of it while not taking over the discussion.  For those times where one developer is lacking the confidence to disagree, then it will often fall to other developers or the team lead to do the disagreeing for them, at least from the start.  As they start to see that disagreeing isn’t a negative trait in a review, they will start to feel more comfortable in reviews and discussions will start to flow.

Solutions?
Everyone likes a quick fix solution, but as with anything worth doing, it’s not always easy to solve these problems, especially one as difficult as a lack of buy in into a new process.  But I’ve generally found that most problems, like many team issues, are beaten by strong leadership and by being open about why the process is being introduced and how, in the long term, the developers lives will improve as a direct result of being part of it.

But to be honest, if you’re trying to solve these problems and you don’t buy into it yourself, then you’re onto a losing streak either way.  If you don’t believe in the process, then chances are you’re not to worried if it falls by the way side.  Which is fair enough.  Code reviews are not a mandatory part of software development and if the literature on the benefits of code reviews doesn’t persuade you, then nothing will.

Though I’d recommend you read them all again.

I was e-mailed by a co-worker the other week asking for advice on how to test a new feature that was implemented solely using private member functions and inaccessible objects within an existing system. It’s an interesting question that people often struggle with when trying to ‘unit test’ complex or high level systems when they are used to testing low level systems or classes where everything is accessible.

I use quotes around ‘unit testing’ because I think at times using the phrase unit testing becomes unhelpful when talking about system testing. I don’t know if it’s just the books I’ve read or the outlook of people I’ve discussed this with, but on the whole when someone says “I need to write a unit test to…” they almost always mean “I’ve just written something and now I must test the individual functions to make sure they all work”.

When testing a high level system, especially one that is being tested after the implementation is finished, people can struggle to take the processes they used to test lower level systems and apply them to more complex, interdependent objects.

Take the following unit test which checks that objects are being correctly destructed when a vector is cleared.

TEST(DestructorCalledOnClear)
{
  ftl::vector theVector(10);  // Reserve more than we’ll push
  TestClass tempObject;

  TestClass::m_destructorCall = 0;

  theVector.push_back(tempObject);
  theVector.push_back(tempObject);
  theVector.push_back(tempObject);

  theVector.clear();

  CHECK_EQUAL(3, TestClass::m_destructorCall);
}

This is a pretty standard looking unit test. We have some input and we’re testing the behaviour of the function we’re calling and it’s affect on the data passed to it. At this point we’re working at a pretty low level, so we have the opportunity to test components almost independently. Sort of.

These functions might be low level, but they are still built up of individual (and often inaccessible) components. All these functions are calling other functions we don’t have access to and we’re obviously not testing the individual lines of code, which will be generating multiple lines of ASM. We can accept this because we’re both used to it and we simply have to. We don’t try to test the internal implementation because we know it works based on the input and generated output of the tested functions.

We have other tests which check the individual elements of this test (push_back() adding elements and clear() reducing it’s size to 0), and basing our more complex test on these allows us to concentrate on the more complex behaviour (the destruction of objects inside the vector). It’s this mindset of testing independent elements and basing more complex tests on simpler tests that needs to be expanded as the systems become more complex and more inaccessible.

Take the following use case as an example.

A title can only communicate with a the game save system using messages. It has no access to the components of the game save system at all, and can only request processes (like asking for data to be saved) and wait for a result message (like the results of a save operation). The system has the following behaviour which we want to test – when a user sends some data to be saved, if the data is the same as what was previously saved the save doesn’t take place at all.

Implementation wise, this is done completely through the use of private functions so nothing other than the internal object itself has access to the data and how it behaves. So what’s the approach here, because we certainly don’t have the opportunity to run what people would call ‘standard unit tests’ on this system as it stands.

The first question to ask is if the system itself is structured in the best possible way. For example, we have an object which is both carrying out the logic of saving but also the management of the data itself. As a result the data management aspect is hidden deep inside the system, making it both impossible to access and impossible to test but knowing the data is managed correctly is crucial to testing the whole system.

So the first approach would be to extract the data management element into it’s own structure. Having an independent object responsible for caching the data and performing any operations on the it (in this case checking if the new data is the same as the old) not only improves maintainability (under the single responsibility principle), it also allows us to test the specific data management step of any save process independently before plugging it into the system as a whole.

Now we’ve extracted what should be an independent element of the system (and directly tested the same data checks process), we can look at the rest of the system. We already have tests for the low level save API, so all we now need to test the user side behaviour.

We know the following behaviour should exist for the user
* When a user requests a save, the data, if different, is saved
* If the data is the same the system reports ‘success’ at the point the save was requested
* If the data is saved the system reports success after the save has completed

So based on this we can now test the system as a whole, knowing that we’ve extracted the independent systems into their own lower level unit tests with these tests concentrating on the high level behaviour of the save process as a whole.
* Requesting a save doesn’t return a success until the data has been saved. We can tell this via the content of the result message and the time it takes to arrive
* When requesting another save with the same data, the result should be instantaneous – also indicating success – which will only happen if saving doesn’t occur
* When following that with changed data, the save successfully completed and it doesn’t return instantaneously

Because we’ve extracted the data management structure into its own test, we can safely assume that element of the system works and it’s the high level, end user experience, that we’re testing in exactly same way in which a user would use it.

#define protected public
Opening up the system is often suggested as a way of testing the ‘internals’ of an object but the more testing deviates from how people use a system the less the tests can be relied on in a real life situation. It’s all good and well a test passing independently of everything else, but if it falls over when used by a client then testing has failed.

Changing protected or private to public (with a method of your choosing) only increases the difference between test usage or real-world usage and quite often internal structures or algorithms may well not work independently, leaving you to replicate what the object would be doing anyway which can easily lead to over-mocking your tests.

So it opens up the question of how much your system is doing and how much of that work could be extracted into more independent objects that are both easier to test and (more importantly) easier to maintain. In these cases making the system more modular or changing how the data is handled will improve both testing and code use without compromising encapsulation and use case testing.

In the last few posts, I’ve covered the development of a type safe and easy to use flag set, and as a result it now contains 3 required template parameters, two of which declare default values.

template <
          typename TEnum,
          int TMaxFlags = 8,
          typename TNames = union_names<bit_set_size<TMaxFlags>::Value>
         >
class flag_set

The last two properties have default values, but to be honest, people don’t want to have to add a Noof option to the declaration just to define the flag names.  And they don’t want to keep adding long type declarations, even when using typedef’s to cut down on the clutter.

We have the following coding standard at work which can provide an interesting structure defining both our flags and flag names

Enums should be defined within a structure to provide proper name scoping

struct MyProperty
{
  enum Enum
  {
    FlagEntry1,
    FlagEntry2,
  };
};
MyProperty::Enum property = MyProperty::FlagEntry1;

In the past enums were defined with the name of the enum appended to the names of its entries. While this does reduce name conflicts it is a poor way to scope a property.

Since we can guarantee that our enums are defined like the above, we can make the assumption that people will (or will be happy to) add their flag names as part of this structure.

// Structure defining the flag enums and also
// defining the flag names which we need
struct StateFlags
{
  // Flags
  enum Enum
  {
    HasTarget,
    CarryingWeapon,
    Drugged,

    Noof,
  };

  // Names
  struct Names
  {
    uint HasTarget:1;
    uint CarryingWeapon:1;
    uint Drugged:1;
  };
};

Making that assumption allows us to define the following structure

template <typename TFlags>
struct flag_set_ex
{
  typedef flag_set <
                     typename TFlags::Enum,
                     TFlags::Noof,
                     typename TFlags::Names
                   > flags;
};

This then allows users to define their flags sets quickly using all the available properties

flag_set_ex<StateFlags>::flags myStateFlags;

If programmers don’t want to use the flag names, they can obviously use the base flag set type. And to keep things consistent, we can add the a ‘::flags’ typedef to the base flag set allowing the definitions to be the same regardless of the flag set type being used.

template < typename EnumT, uint SizeT, typename NamesT >
class flag_set
{
public:
  typedef flag_set<EnumT, SizeT, NamesT>     flags;
};

// Define a normal flag set with the same syntax as the flag_set_ex making
// it much easier to interchange the types depending on their needs
flag_set<StateFlags::Enums>::flags myStateFlags;

In the previous post I described how I implemented a type safe, template driven, system for declaring and using flags in C++. But one of the main issues with the solution was not being able to see what flags were set when debugging.

Which Flags Are Set?
When setting bits to store flags there are varying levels of output you can get from them in the debugger.

One of them absolutely sucks – you don’t want to be doing bit arithmetic to know whats been set

The next one is better, it gives you some idea of what’s been set, but not much.

Finally, there is the following, which is much better.

A normal way of getting this behaviour out of the debugger when using base type flags is declaring them alongside a structure containing the individual flag entries inside a union.

struct  StateFlags
{
  union
  {
    uint m_allFlags;
    struct
    {
      uint HasTarget:1;
      uint CarryingWeapon:1;
      uint Drugged:1;
    } m_names;
  };
};

This provides you with the best possible output in the debugger so it’s something I wanted to add to the flag set to make debugging much easier.

Flag Set ‘Names’
Since this is an interface driven implementation the ideal syntax starts out as the following

// The enums used to define the flag set
enum StateFlags
{
  HasTarget,
  CarryingWeapon,
  Drugged,
};

// The ‘names’ we’d like to be displayed in the debugger
struct StateNames
{
  uint HasTarget;
  uint CarryingWeapon;
  uint Drugged;
};

We need to be able to pass these names through to the container, and since we want the type defined on a per flag basis we can add these names to the flag set definition.

template <typename TEnum, int TMaxFlags, typename TNames>
class flag_set

Since we already have our own defined type to store the flags being set and we know that we can use a union to combine the storage type and the flag names, we simply append our name type along side the storage type so when a flag is set the name is automatically set alongside it.

template< int TMaxFlags, typename TNames >
struct bit_union
{
  // Container type with enough bits to store TMaxFlags flags
  typedef bit_set<TMaxFlags> container_type;
  typedef TNames name_type;

  union
  {
    container_type m_type;
    name_type m_name;
  };
};

I’ve extracted the bit_set into it’s own structure rather than including it directly inside the union for two reasons

  • It allows the bit set to be specialised based on the number of flags without worrying about the union. If we had a large number of specilised storage structures, we’d have to duplicate the union inside each one.
  • It allows us to strip out the union and the names in builds that don’t need it, without affecting the speclialised bit sets.

So the flag set now contains user defined names without having to worry about how they are implemented.  The templated utility functions used to set, remove and check the flags also don’t need to know anything about it so aren’t effected.

template < typename TEnum, int TMaxFlags, typename TNames >
class flag_set
{
public:
  void set(const TEnum flag)
  {
    // The utility functions still take the same
    // type regardless of the name union
    set_bit_entry(m_unionFlags.m_type.m_bitArray, flag, true)
  }

private:
  static const uint BitSetSize = ((( TMaxFlags + 7 ) & ~7 ) >> 3)*8;
  bit_union< BitSetSize, TNames > m_unionFlags;
};

Now we can simply define our flag set with a unique set of names and in the debugger we get exactly what we we’re looking for (annoyingly we can’t get around the need to define the integers in the structure as one bit entries if we’re going to add them to the union).

struct MyFlags
{
  enum Enum
  {
    UnderAttack,
    InCover,
    // etc.

    Noof,
  };

  struct Names
  {
    uint UnderAttack:1;
    uint InCover:1;
    // etc.
  };
};

flag_set<MyFlags::Enum, MyFlags::Noof, MyFlags::Names> flagSet;

flagSet.set(MyFlags::InFormation);
flagSet.set(MyFlags::KnockedDown);

Default Names
But we have one problem.  We don’t want to force people to have flag names if they don’t want them. Depending on the type and volatility of the flags they might not need the names in the debugger and it can be a lot of code to add for every single flag set in use.  And since we have a default template parameter for the number of flags, we need to provide a similar default parameter for the names.

Defining the default parameters isn’t much of a problem.  We can quickly define union name companions for 8+ flags (a few small macros make this much easier) but how to drop them into the flag set template as a multiple of 8?

Since we already calculate our multiple of 8 flag count when defining the bit_union type, we can do the same when declaring the default name structure.  But we certainly don’t want such an over-complicated calculation twice in the class.

Since we need the multiple of 8 value outside the class in the template declaration, we can’t just declare it inside the class, so by storing it in it’s own type we can use it where-ever we need it.

template<uint TSize>
struct bit_set_size
{
  enum
  {
    Value = ((( TSize + 7 ) & ~7 ) >> 3)*8,
  };
};

We can now use this to declare our default flag set name where union names is correctly specialised depending on how many flags the user has declared.

template <
           typename TEnum,
           int TMaxFlags = 8,
           typename TNames = union_names<bit_set_size<TMaxFlags>::Value>
         >
class flag_set
{
private:
  bit_union< bit_set_size<TMaxFlags>::Value > m_unionFlags;
};

So now a user can simply declare their flag set with the enum type and by default they will get something out of the debugger that’s more useful than just showing the single base type value.

flag_set<MyFlags::Enum> flagSet;

flagSet.set(MyFlags::InFormation);
flagSet.set(MyFlags::KnockedDown);

Flag Set Drawbacks
In adding custom names to the flag set, we’ve added a number of new templated structures.  The default name types, the bit_set_size and the bit_union and all this comes with a cost.  Obviously we’re generating a number of structures for all the flags that use this system and it certainly does effect compilation speed.

Based on compilation speeds of a couple of projects that use the flag set heavily over a few months, I’d estimate that adding the flag set (before the debug names were added) put an extra 20-30 seconds on a full rebuild.  After adding the debug names, it probably added another 30 seconds or more, bringing a rebuild of the entire project up from 3 minutes to around 4.

But I can live with that for a number of reasons

  • The flag set has introduced a more type safe environment for developers.  On a project that has a number of developers dipping into it, this is vital and does reduce the possibility of human errors causing simple but difficult to track down bugs
  • The code is clearer and easier to read, maintain and debug.  This is a bonus for all the same reasons above.
  • I rarely rebuild the entire project.  A well structured project reduces the number of rebuilds greatly and as such the longer rebuilds actually don’t effect me that much (an average build usually takes <10 seconds)

There’s one more thing I want to look in regards to the flag set, and that’s how we can take the basic template declaration and simplify it so users can quickly and easily declare a flag set with a large number of flags and their own flag names within a single structure.

But I’ll cover that in another post.

As was pointed out in the last post, I… ahem… forgot to actually put down some working examples of the flag set, so I ended up describing something but not actually showing off the final product.

So I thought a good start would be to include a few of the flag set unit tests (since unit tested code is known to be good ‘working’ documentation in itself) followed by a simple working example based on the examples used at the start of the last post.

So, here’s a load of code that you can feast you eyes on (the unit test library being used here is UnitTest++)

struct TestFlagSet_Small
{
  enum Enum
  {
    Flag1,
    Flag2,
    Flag3,
    Flag4,

    Noof,
  };
};

struct TestFlagSet_Large
{
  enum Enum
  {
    Flag1,
    Flag2,
    Flag3,
    Flag4,
    Flag5,
    Flag6,
    Flag7,
    Flag8,
    Flag9,
    Flag10,
    Flag11,
    Flag12,

    Noof,
  };
};

class ExampleFixture
{
public:
  // Define a couple of flags, one defaulting to 8 flags
  flag_set<TestFlagSet_Small::Enum> smallFlagSet;
  flag_set<TestFlagSet_Large::Enum, TestFlagSet_Large::Noof> largeFlagSet;

  ExampleFixture(void)
  {
  }

  ~ExampleFixture(void)
  {
  }
};

// Checks that the flag set starts in an empty state
TEST_FIXTURE(ExampleFixture, InitialFlagSet_Empty)
{
  CHECK_EQUAL(smallFlagSet.is_clear(), TRUE);
  CHECK_EQUAL(largeFlagSet.is_clear(), TRUE);
}

// Tests that compilation works when defining flags with sizes < 8
TEST(DefaultTemplateSet)
{
  flag_set<TestFlagSet_Small::Enum, TestFlagSet_Small::Noof>    smallFlagSet;
  flag_set<TestFlagSet_Large::Enum, TestFlagSet_Large::Noof>    largeFlagSet;
}

// Checks if the flag set is clear when flag set removed
TEST_FIXTURE(ExampleFixture, SetThenRemoveIsClear)
{
  smallFlagSet.set(TestFlagSet_Small::Flag1);
  smallFlagSet.remove(TestFlagSet_Small::Flag1);

  CHECK_EQUAL(smallFlagSet.is_clear(), TRUE);
}

// Checks if the flag set is clear when cleared or no flags are set
TEST_FIXTURE(ExampleFixture, ClearSetIsClear)
{
  smallFlagSet.set(TestFlagSet_Small::Flag1);
  smallFlagSet.clear();

  CHECK_EQUAL(smallFlagSet.is_clear(), TRUE);
}

// Checks if the set function correctly sets the right flags
TEST_FIXTURE(ExampleFixture, SetFlagSetIsSet_IsSet)
{
  smallFlagSet.set(TestFlagSet_Small::Flag1);
  smallFlagSet.set(TestFlagSet_Small::Flag4);

  CHECK_EQUAL(smallFlagSet.is_set(TestFlagSet_Small::Flag1), TRUE);
  CHECK_EQUAL(smallFlagSet.is_set(TestFlagSet_Small::Flag4), TRUE);
}

// Check that returned C type is correctly set
TEST_FIXTURE(ExampleFixture, AsCTypeFunction)
{
  smallFlagSet.set(TestFlagSet_Small::Flag1);
  smallFlagSet.set(TestFlagSet_Small::Flag2);
  smallFlagSet.set(TestFlagSet_Small::Flag4);

  // as_c_type allows the flag set to be explicitly passed
  // through to legacy functions and for the underlying
  // datatype to be stored as a base type is needed
  const uint8* flagSetBits = smallFlagSet.as_c_type();

  CHECK( (*flagSetBits) & (1<<TestFlagSet_Small::Flag1) );
  CHECK( (*flagSetBits) & (1<<TestFlagSet_Small::Flag2) );
  CHECK( (*flagSetBits) & (1<<TestFlagSet_Small::Flag3) == FALSE );
  CHECK( (*flagSetBits) & (1<<TestFlagSet_Small::Flag4) );
}

// Test that setting from a C type correctly sets the flags
TEST_FIXTURE(ExampleFixture, FromCTypeFunction)
{
  smallFlagSet.set(TestFlagSet_Small::Flag1);
  smallFlagSet.set(TestFlagSet_Small::Flag2);
  smallFlagSet.set(TestFlagSet_Small::Flag4);

  const uint8 tempStorage = *(smallFlagSet.as_c_type());

  smallFlagSet.clear();
  smallFlagSet.from_c_type(&tempStorage);

  CHECK_EQUAL(smallFlagSet.is_set(TestFlagSet_Small::Flag1), FALSE);
  CHECK_EQUAL(smallFlagSet.is_set(TestFlagSet_Small::Flag2), FALSE);
  CHECK_EQUAL(smallFlagSet.is_set(TestFlagSet_Small::Flag4), FALSE);
}

// And a (very) simple real world example

// Define a couple of flag types
struct BattleProperties
{
  enum Enum
  {
    HasTarget,
    CarryingWeapon,
  };
};

struct CharacterState
{
  enum Enum
  {
    IsStunned,
    IsPoisoned,
  };
};

// Create a flag set for the character states
typedef flag_set<CharacterState::Enum> StateFlags;
StateFlags m_currentStateFlags;

void UpdateCharacter(void)
{
  // We’ve been stunned
  m_currentStateFlags.set(CharacterState::IsStunned);

  // ... do some more stuff

  // Whoops, I would have just made a mistake, but I can’t do this
  // m_currentStateFlags.set(BattleProperties::CarryingWeapon);

  // Update what ever has happened
  UpdateState(m_currentStateFlags);

  // We’ve dealt with our states now
  m_currentStateFlags.clear();
}

void UpdateState(const StateFlags& state)
{
  // Are we stunned?
  if (state.is_set(CharacterState::IsStunned)
  {
  }

  // Another mistake I can no longer do
  // if (state.is_set(BattleProperties::CarryingWeapon)
  // {
  // }
}