29 September 2013

Thinking about undo/redo

    Hi there, this time I want to share some thought about how one can provide undo/redo support in his application. First of all, let's try to figure out what undo or redo means from programmer's perspective. There can be two approaches. First, the classes responsible for the application or a part of it always have some state (usually this is the set of data members), and undo is just a transition from one state to the previous one, and redo, respectively, from a state to the next one. Do not confuse the next one with something that will happen in the future, because redo only becomes available after at least one undo, so redo itself can be viewed as an undo after another undo. Anyway, everyone knows this part fine, because this is supported by most of the applications, especially by those providing a graphical user interface.
    The other case is when the importance is handed to the actions and not to the objects owning the state, so for example on the Document object the insertion (typing) or deletion of a text can be undoable, but for the same Document the resize operation can be not undoable at all. As you can see, the notion of state is not so relevant here, because having two consecutive states we cannot say for sure (for some cases we can, but this will take some more effort which is redundant) which action or what group of actions is responsible for that transition. So this is the second approach - the action itself may be undoable or not.
    What is interesting, if we think further, the action cannot handle such a big responsibility itself. Let's consider the same example, how can the delete action (button, or a menu item) undo itself? Undo means taking back all the changes that were caused by the action. These changes were caused to the Document, more precisely to the state of Document. It means delete has so much access to Document that it can change its state (some string member for example, or a shape object). To be able to undo these changes, delete action should have similar level of access to another behavior for restoring the Document's state. Thus, we again end up with the notion of state, but there is still a huge difference between two approaches. Let me give some real-world examples before I proceed to details.
    Suppose there is a gallery in the center of a city and masterpieces of famous artists are presented there. And you've been visiting the gallery very often during last 5 years, so you have seen all the works, and there is nothing new for you. But one day there is an announcement of an exhibition of a new young artist. You go and see all his paintings or drawings instead of the old usual ones. But after five days, the exhibition ends and the masterpieces are again hung in their usual places. So this is an example of the first approach. The state (original masterpieces) was saved somewhere in the basement, and then it was restored.
    For the second case, suppose your car is parked near the curb in some 20 meters from the traffic light. Then you drive forward 10 meters, and you remember that you've forgotten something, and you put the gear on rear and drive back these 10 meters and park at the same exact place. So now in terms of position your car is in the same place as before, so we can say that the drive action was undone, but the state of the car is not the same, the oil in the engine was still and now you just turned the ignition off and the oil has not manage to come down yet, the engine is a little warmer, the fuel in the tank is a little lesser, and so on.

Let me give two more examples, software related, and then we can go ahead. For first case - imagine you are filling some online form to send to the immigration services, but the form is too long, so there is a button called 'Save Draft' which saves the current state of the form, and then you can close your browser, come back any time later, log in to the same page and continue from where you stopped. In this example, the state is just the set of values of form fields you are filling (most likely all the fields are unique, so the values cannot be derived from each other). You can say "where is undo here". Undo is when not actually a supported operation for this case, but if you save the draft then continue filling the form, and suddenly the electricity goes off, after that you open the browser and actually the changes are undone to the previous state.
    For the second case - imagine a huge 3d graphical editing tool like Autocad or CorelDraw, and you start stretching some line, but your hand is not accurate enough and you move the mouse a little more than necessary, so the line becomes longer than you need. There are thousands of other objects in the same document you are editing, so you click undo, and the stretch action is undone. It is easier to stretch the single line back, than to store all the thousands of objects to restore the state.
    OK, enough fairy tales, now let's use some programming. If you don't mind I will start with the last two examples. So now I want to support the save draft action for the online form. I think everyone can do that:

class OnlineForm
{
public:
       void load(); // loads the form with the saved data if available
       void saveDraft() const; // saves the current state of the form
       void submit() const; // submits the form data to the server
       void clear(); // clears all the fields of the form
private:
       // form data
       int id;
       int phone;
       string name;
       string address;
       // ... and so on

};

saveDraft can save the filled data in an .xml file, for example, or in a database, and load will read and fill the form from the same place. Now if we want to support undo we need to be able to save the state as many times as needed and store these snapshots separately. And if we know the snapshot corresponding to current state, then undo is only a call to load with the previous snapshot, and redo is a call to load with the next snapshot provided as an argument. It is, of course, preferable to have two member functions getSnapshot and setSnapshot (or, which is more conventional, getState and setState) which will respectively retrieve and set the state of the form. Besides this, we better encapsulate the state into another class, this will keep things more decoupled and the function prototypes more brief. Let's call this class FormData. As I already mentioned load should be replaced by setState, which will accept an instance of FormData, so that the original source of our data (.xml, database, etc.) is not hard-coded in the OnlineForm class, but is defined as a serialization/deserialization of FormData. This will also make it easy to change the source, simply introducing a new serializer class for FormData.
    Also, we don't need saveDraft() anymore, we need getState() which will return the current state of the form within a FormData instance. So now our code will look like this:

struct FormData
{
       // form data
       int id;
       int phone;
       string name;
       string address;
       // ... and so on
};

class OnlineForm
{
public:
       FormData getState() const; // replaces saveDraft with
                                  // a better functionality
       void setState(const FormData& newState); // restores the state
       void submit() const; // submits the form data to the server
       void clear(); // clears all the fields of the form
private:
       FormData state;
};

    To support undo and redo for the OnlineForm, we need to have some convention on what we assume an undoable action. It can be a single character append in a textbox, or it can be a single field update (textbox, checkbox, combobox, etc.), or it can even be an update of a group of related fields (Personal info, job details, address, payment method, etc.). After each of this conventional action(s) we will call the getState and store the state in some place, and we always should keep track of the current state.

class OnlineForm
{
public:
       FormData getState() const
       {
              return states[currentStateIndex];
       }

       void setFirstName(const string& name)
       {
              FormData newState = getState();
              newState.name = name;
              setState(newState);
       }

       void undo()
       {
              if (currentStateIndex > 0)
              {
                     --currentStateIndex;
                     // probably, we should have some function call here,
                     // to refresh the form view, so that the correct data
                     // appears in the fields. And the "correct" data is
                     // the one retrieved by getState() call.
              } else {
                     // We are in the initial step and cannot undo,
                     // we can throw an exception, or for a better UX
                     // we can just disable undo button at all when
                     // this condition (else case) becomes effective.
              }
       }
       void redo(); // almost the same code as for undo().

private:
       void setState(const FormData& newState)
       {
              states.push_back(newState);
              ++currentStateIndex;
       }

       vector<FormData> states;
       int currentStateIndex;

};

I moved setState() declaration to the private section, because any intervention from outside would break the undo logic, now each separate action, like setFirstName will call setState. So generally, we have the undo/redo logic working, but within a bad implementation, most likely we should have an interface with undo() and redo() pure virtual functions, and it can be a generic one of the state object type - in our example FormData. Also, the OnlineForm itself may not be responsible for storing the states - instances of FormData, so we may need the third class responsible for the storage. If you remember this comes to the memento pattern described in GOF Design Patterns book. One more good thing we can do, make undo/redo accept an integer of how many steps we want to go back/forth. So there are many details which mainly depend on the exact application we write. But the general idea is the one already described:
  • have a strictly defined state, which most likely does not contain all the data members of the class in the question (in our example state does not have currentStateIndex, etc.);
  • after each undoable step, make another snapshot of your object, i.e., store a new instance of state;
  • track the current state in your list of states so that you can go back and forth whenever needed (needed here means, for example, pressing undo/redo button on the UI).
So what bad things can arise here, what are the pitfalls of this approach? Suppose the state does not consist of some primitive objects like int or string (string also can be dangerous, the reason is coming soon), but it has some internal data structures, for example an object graph. We cannot easily copy the whole state every time we do some undoable action, because, first, the state can be huge in size, and keeping many instances of state will fill the memory very soon. Second, we need a deep copy for storing the state, we cannot just keep the address of the root node of the graph and restore it from that address. In some cases, we can implement some kind of simple compression, for example, we can store only the difference between previous and current states, not the whole state, more precisely, store only the fields which have been changed. But this will introduce some more complexities in code. Also this still does not solve the deep copy problem. That's when the second approach of implementing undo/redo comes to a stage.
    In second approach, as I have already mentioned, we do not need to keep any state, but the actions that we want to be undoable, have to have some more logic. Let's take the same example of a graphical editor. There are several actions like stretch a line, fill the shape, etc. First, each action has some execute() function, which incorporates the logic of the action itself. Now we want to add an undo() function, which will work at least so well as execute does. For simplicity I will not focus on making the code generic, so I will present some specific actions for specific targets (always referring to the examples already brought up). Here are our actions for stretching a line and filling the shape with some color:

class StretchAction
{
public:
       void setLength(double length);
       void execute(Line line);
private:
       double length;
};

class FillAction
{
public:
       void setColor(Color color);
       void execute(Shape shape);
private:
       Color color;
};

Let's think about how to support undo. We need to keep the history of executed actions, so that each time we press undo in the editor it retrieves the current item and calls undo() on it. This means that each action should have some small amount of history also, to know how to revert things back. For stretch action we can have the previous (one that was before stretching) position of the line, uniquely defined by its endpoints, for example. For the fill action, we have no other way than keeping the old color. So we can change our actions a little bit and get the following version:

class StretchAction
{
public:
       void setLength(double length);
       void execute(Line line);
       void undo(Line line)
       {
              line.moveTo(oldP1, oldP2);
       }
private:
       double length;
       Point oldP1, oldP2;
};

class FillAction
{
public:
       void setColor(Color color);
       void execute(Shape shape);
       void undo(Shape shape)
       {
              shape.fill(oldColor);
       }
private:
       Color color;
       Color oldColor;
};

You probably have many questions now, and I will try to answer all of them, well, all that I can think of.
  • How do we know which Line or Shape object to pass to undo()? - I assume that the worksheet itself supports this functionality, e.g., current selection, active object, etc. If not, we could make the action to execute on the worksheet itself and also remember the Line object also which it operated on. But in this case we would have to worry about some synchronization stuff, for example when we delete a shape from the worksheet and then undo, the address of the new object is not necessarily the same, so we should think of some complicated way of referring to the worksheet objects.
  • How do we know that the object acted upon will always support the operations implied by undo? - As we can see, doing something and undoing that change do not always invoke the same function of the object acted upon. But most likely if we are the author of undo/redo support, we also have a change access to the source code for the classes for which we are implementing that support. I mean we can add the missing pieces of behavior.
  • What do we store in the undo history? - We can store instances of the actions themselves, though we would need to make an abstract base Action (see Command design pattern) which would be the parameter type of the history container. In case of my example I hardly can find a more efficient way of retaining the history, because we would still need to know the original and target values (color or pair of endpoints) and we still would need to know which action was applied, so at least some 4 bytes again. But this is exactly the case now - we have the values and vptr (I am considering that you are not reluctant and have already created the base abstract Action) for each instance of class which is again 4 bytes (for 32-bit systems).
  • How to implement redo() then? - Well, redo() will be just a single call of execute() on the same object.
  • Who should keep the history? - Most probably the editor, or if it supports multiple worksheets, it may be the worksheet, it depends on your business logic, so this is not always the same.
For some actions you may not even need to store additional data for undo support. Remember my example of driving the car. So if you keep the distance passed during driving, then driving back will need exactly the same distance, nothing more (well you may keep the route also, to know when and where to turn, but this is again the data required for driving forward, so no additional cost for backing up).
    This was the idea of the second approach, each action is responsible for reverting the changes done by itself. Both of these approaches share some common thing - they keep the old state to be able to restore it later. This is actually the essence of undo, without knowing what was before, we cannot go back to that state, and no one can do that, so do not think of any new ways of implementing undo, without having information about the state which you want to go back to.

I did not bring up the complete code and even not a well-defined piece of code, it was just some mock-up to make the comprehension easier and faster. And I put the title "Thinking about..." not "Implementing...", so don't blame me. I believe there are still open questions, but I don't have much time to cover everything, and everything in my understanding differs from what you think of it. Please post your questions in the comments, and I will be happy to answer. Thanks for reading.