Tuesday, November 1, 2011

Demonstration: Atomic Access and Interrupt Routines

This is a post I made a while back on the Arduino Forums.  It has since been buried, so I figured I'd repost it here so I have easier access to it in the future.

Had an idea this morning for what I thought would be an interesting little demonstration of why Atomic access is necessary when working with variables that are used in Interrupt routines.

Demonstration code:
#include "TimerOne.h"
#include <util/atomic.h>

volatile unsigned int test = 0xFF00; 
void setup()
{
  
  Serial.begin(115200);
  Timer1.initialize(10000); //Call interrupt every 10ms
  Timer1.attachInterrupt(myInterrupt);
  Timer1.start();

}
 
void myInterrupt()
{
  //Interrupt just toggles the value of test.  Either 0x00FF or 0xFF00
  static bool alt = true;
  if(alt) test = 0x00FF;
  else    test = 0xFF00;
  alt = !alt;
}
 
void loop()
{
  unsigned int local;
  //ATOMIC_BLOCK(ATOMIC_RESTORESTATE)
  {
     local = test;
  }
  
  //Test value of local.  Should only ever be 0x00FF or 0xFF00
  if(!(local == 0xFF00 || local == 0x00FF))
  {
    //local value incorrect due to nonatomic access of test
    Serial.println(local, HEX);
  }
}

If you download and run this code as is, unmodified, you will see intermittent outputs on the Serial port of 'FFFF' and '0'.

If the variable test is only ever assigned the values of 0x00FF or 0xFF00, then why is the value of our local variable occasionally 0x0000 or 0xFFFF?

The reason is that the assignment of the 16 bit value stored in our variable test to our variable local is not what we call atomic.  What do I mean by atomic?  An atomic operation is one that completes as a single, uninterruptible step.  Our assignment is not atomic because it takes the microcontroller multiple instructions to perform.  This is because the microcontroller is an 8-bit processor, so it only handles data in 8 bit chunks.  When we are working with 16 bit data, the processor still only handles that data 8 bits at a time, so it will always take multiple instructions to perform even the simplest operations on that data.

Because it takes multiple instructions even to copy data from one variable to another, there is the possibility that the interrupt will occur in the middle of that assignment.  When this happens, the processor stops what it is currently doing, switches to processing the ISR, and then returns to where it left off.  The processor doesn't care that the ISR changed the value of the variable while it was in the middle of copying that value to another variable, so we get situations where it assigns half of one value to our local variable, gets interrupted by the ISR, and then returns to copy the other half of the second value to our local variable.  And now our local variable contains invalid data.
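To make the failure concrete, here is a small simulation of that interrupted copy which can be run on a desktop compiler.  It is only a sketch: tornRead is a hypothetical helper, and it assumes the low byte is copied first (the real order is the compiler's choice).  It shows how the two legal values can combine into a value that was never assigned.

```cpp
#include <stdint.h>

// Simulates how an 8-bit CPU copies a 16-bit value one byte at a time.
// 'before' is the value of the shared variable when the copy starts.
// 'after' is the value the (simulated) ISR writes between the two byte copies.
// Assumption: the low byte is copied first; the real order is up to the compiler.
uint16_t tornRead(uint16_t before, uint16_t after)
{
  uint8_t low = before & 0xFF;          // first byte, copied from the old value
  // ...the interrupt fires here and changes the shared variable...
  uint8_t high = (after >> 8) & 0xFF;   // second byte, copied from the new value
  return (uint16_t)(((uint16_t)high << 8) | low);
}
```

tornRead(0xFF00, 0x00FF) gives 0x0000 and tornRead(0x00FF, 0xFF00) gives 0xFFFF, the same two bad values the demonstration prints.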

The solution to this problem is to force the processor to perform the operation without allowing any interruptions to occur.  There are several methods of doing this, but the basic requirement is that interrupts be disabled prior to performing the operation that requires atomicity, thus ensuring that the entire operation is completed without interruption.

If you uncomment the one line in my demonstration code: ATOMIC_BLOCK(ATOMIC_RESTORESTATE)
the code will then run indefinitely without any output to the Serial port.  ATOMIC_BLOCK() is a macro provided by the AVR Libc library that designates a block of code to be performed atomically (i.e., without interruption).

It is similar to using cli() and sei(), or Arduino's wrappers noInterrupts() and interrupts(), with a couple of potentially key differences.  With the parameter ATOMIC_RESTORESTATE passed in, it doesn't simply re-enable interrupts upon exiting the atomic block.  It restores the interrupt state to whatever it was before entering the atomic block.  In this simple demonstration code, the difference may not be apparent.

In more complicated code, with multiple paths of execution and potentially convoluted execution of various functions, the state of interrupts won't necessarily be known upon executing an ATOMIC_BLOCK(), so it may be necessary to restore the previous state as opposed to arbitrarily re-enabling interrupts.  Perhaps they were already disabled prior to entering the atomic block.

The other benefit is that you are forced to define a block of code to make atomic, by way of braces ({ and }).  What this means is that it's not possible to forget to restore your interrupt state.  Again, in more complicated code, if you use say noInterrupts(), it's possible you could forget to call interrupts() to re-enable them, and then have to deal with debugging why your interrupts aren't working.  If it is code that is called intermittently, or under certain/rare conditions, it can be very hard to debug.

With ATOMIC_BLOCK() there's still the possibility that you've got too much code in your block, or not all the code that needs to be atomic, but it will always either re-enable interrupts, or restore them to the state they were in prior to the atomic block.  If you forget to close the block, you get a compile error.

Thursday, June 30, 2011

Parsing:  Quick guide to strtok()

This post is sort of a continuation of my Serial Comm guide.

There are a variety of ways to parse C strings, all with their pros and cons.  One method uses the C library function strtok().  Its advantages are that it takes less coding on your part to parse a string, and that it is a bit more robust than some other methods (again without additional coding on your part).  Its disadvantages are that it mangles the C string it parses, and that the binary size may be a bit larger than with other methods (important if you're running low on flash space, but the difference will be on the order of about 1500 bytes).

First, we assume you have some C string with data in it you need to parse out.  I'll be using an example from my Serial Comm guide, the Sparkfun Razor IMU output string, which will look something like:

!ANG:7,320,90

Here we have three angle values we are trying to parse out into a usable format, like an int.  To do that, we'll be using another C library function as well: atoi().  This function converts a C string into an int.  You pass it a char pointer to the portion of the string you want converted to an int, and it returns the int value of what you passed it.  Its usage will look something like:

void setup(){
  char instring[] = "!ANG:7,320,90";
  char* valPosition = instring + 5;
  int value = atoi(valPosition);
  Serial.begin(115200);
  Serial.println(value);

   
}

This is a contrived example that is of little real use, but it does show the usage of atoi().  We will now use strtok() to break up our string into its discrete tokens, and use atoi() to convert those token values to ints.  strtok() takes two parameters and returns a char* to the next token in the string.

The first parameter is the string you want to tokenize.  It's important to note that you only pass strtok() a pointer to this string once, on the first call.  It retains this pointer internally, and on subsequent calls (where you pass NULL instead) it returns a pointer to the next token in the string, or NULL when it reaches the end of the string.  The second parameter is a list of delimiting characters.  For our Razor IMU, these delimiters would be the exclamation point, colon, and comma.  Some example usage:
 
void setup(){
  char instring[] = "!ANG:7,320,90";
  char delimiters[] = "!:,";
  char* valPosition;
 
  //This initializes strtok with our string to tokenize
  valPosition = strtok(instring, delimiters);
 
  Serial.begin(115200);
 
  while(valPosition != NULL){
    Serial.println(valPosition);
    //Here we pass in a NULL value, which tells strtok to continue working with the previous string
    valPosition = strtok(NULL, delimiters);
  }
 
}
Using strtok() is typically a two step process.  The first call to strtok() is our initialization call.  We pass in the string we want to tokenize, and it passes back a pointer to the first token.  It also saves its position in the string internally for the second step of the process.

This second step is typically a loop of some sort that repeatedly calls strtok.  In this case, we check to see if our return value is NULL or not.  If it's NULL, strtok has finished tokenizing our string.  If it isn't NULL, we make another call to strtok.

This code will provide the following output:

ANG
7
320
90
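The "mangling" mentioned earlier is visible if you look at the input string after it has been tokenized.  The sketch below is plain C++ that also compiles off the Arduino; tokenizeInPlace is a hypothetical helper name, not a library function.  strtok() overwrites the delimiter that ends each token with a null character, so the original string is effectively cut short.

```cpp
#include <string.h>

// Tokenizes 'text' in place and returns how many tokens were found.
// tokenizeInPlace is a hypothetical helper name used for illustration.
int tokenizeInPlace(char* text, const char* delimiters)
{
  int count = 0;
  char* token = strtok(text, delimiters);   // first call: pass the string
  while (token != NULL) {
    count++;
    token = strtok(NULL, delimiters);       // later calls: pass NULL
  }
  return count;
}

// Example:
//   char instring[] = "!ANG:7,320,90";
//   int n = tokenizeInPlace(instring, "!:,");   // n is 4
//   // instring now reads "!ANG" because the ':' was replaced with '\0'
```

If you need the original string afterwards, make a copy before tokenizing.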

So the next step to do is utilize atoi to convert our tokens into actual int values.  Let's just start with a couple additional lines of code:

void setup(){
  char instring[] = "!ANG:7,320,90";
  char delimiters[] = "!:,";
  char* valPosition;
 
  valPosition = strtok(instring, delimiters);
  int angle;
 
  Serial.begin(115200);
 
  while(valPosition != NULL){
    angle = atoi(valPosition);
    Serial.println(angle);
    valPosition = strtok(NULL, delimiters);
  }
 
}

Here we declare an int variable called angle to hold our return value from atoi, and we make a call to atoi() in our while loop.  We are also printing out our new angle value.  The output of this code is similar to our previous code:


0
7
320
90

Our angle values are all correct, but the first line is the result of calling atoi() on a non-numeric string.  We don't really want to convert our ANG token.  It isn't a numeric value so atoi just returns zero.  We have two options here.  The first is to change our code to ignore the first token in our string.  Another is to change our input string to get rid of ANG altogether.  If you recall from the Serial Comm guide, I talked about being able to use either ! or : for the start character with the Razor IMU.  It made little difference to the serial comm code, but here it can simplify things a bit.  So let's look at some code that goes that route:
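For completeness, the first option (ignoring the first token) could be sketched roughly as follows.  parseLabelled is a hypothetical helper name, and the code assumes the label is always the first token in the string.

```cpp
#include <stdlib.h>
#include <string.h>

// Parses up to maxValues integers from a Razor style string such as
// "!ANG:7,320,90", discarding the leading label token ("ANG").
// Returns the number of values stored in 'values'.
// parseLabelled is a hypothetical helper name.  The string is modified by strtok().
int parseLabelled(char* text, int* values, int maxValues)
{
  const char delimiters[] = "!:,";
  // The first token is the label; fetch it and throw it away.
  char* token = strtok(text, delimiters);
  if (token == NULL) {
    return 0;   // empty string, nothing to parse
  }

  int count = 0;
  token = strtok(NULL, delimiters);
  while (token != NULL && count < maxValues) {
    values[count++] = atoi(token);
    token = strtok(NULL, delimiters);
  }
  return count;
}
```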

void setup(){
  char instring[] = ":7,320,90";
  char delimiters[] = "!:,";
  char* valPosition;
 
  valPosition = strtok(instring, delimiters);
  int angle;
 
  Serial.begin(115200);
 
  while(valPosition != NULL){
    angle = atoi(valPosition);
    Serial.println(angle);
    valPosition = strtok(NULL, delimiters);
  }
 
}

All we've done is change our input string to get rid of the ANG token, which never changes anyway, so is of no real value.  This code only prints out each value of angle as it is parsed, and each previous value is lost.  What we really want is to record all three values, presumably to perform some calculations on afterwards.  The best way to handle this is with an array of ints, something like this:

void setup(){
  char instring[] = ":7,320,90";
  char delimiters[] = "!:,";
  char* valPosition;
 
  valPosition = strtok(instring, delimiters);
  int angle[] = {0, 0, 0};
 
  Serial.begin(115200);
 
  for(int i = 0; i < 3; i++){
    angle[i] = atoi(valPosition);
    Serial.println(angle[i]);
    valPosition = strtok(NULL, delimiters);
  }

}

The first change is our declaration of angle.  We now declare it as an array by suffixing it with [], and initialize it to three elements all equal to zero.

We then replace our while loop with a for loop.  Our input string is of a known, specific format, with three elements, so we loop 3 times, storing each successive value into the next element of our array.  When it's all done, we have all three of our values stored in our array.  A couple of notes here though.  Because our array was declared inside setup(), once setup() exits, our array is gone.  Also, it wouldn't be a bad idea to add some code to check for a NULL value returned from strtok().  As long as the input always contains three values the code will work without it, but if a value is missing strtok() returns NULL, and passing NULL to atoi() is undefined behavior in C, so don't rely on getting a zero back.  If you're going to utilize these values for additional calculations, you'll probably want to completely discard anything that didn't fully parse properly to avoid processing garbage data.
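A version with the NULL check might look like the sketch below.  parseThreeAngles is a hypothetical helper name, and returning false on a short string is just one possible way to signal that the data should be discarded.

```cpp
#include <stdlib.h>
#include <string.h>

// Parses exactly three comma separated angles from a string like ":7,320,90".
// Returns true only if all three tokens were present.  On failure the contents
// of 'angles' should be ignored.  parseThreeAngles is a hypothetical helper name.
bool parseThreeAngles(char* text, int angles[3])
{
  const char delimiters[] = "!:,";
  char* token = strtok(text, delimiters);
  for (int i = 0; i < 3; i++) {
    if (token == NULL) {
      return false;   // string ended early, so never hand NULL to atoi()
    }
    angles[i] = atoi(token);
    token = strtok(NULL, delimiters);
  }
  return true;
}
```

Declaring the angles array outside the function (for example as a global) keeps the values available after the call returns.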


A Note on converting strings to other types:
atoi() is a useful function for converting a numeric string into an int, and there are other library functions available for some of the other types as well.
atof() can be used to convert float values (it technically returns a double, but doubles and floats are the same on 8-bit AVRs anyway).
atol() returns a long value.

These are all part of the AVR Libc package, which provides most of the standard C library functions for the AVR 8bit micros, and more details can be found at their homepage here:

AVR Libc Home Page
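One limitation of atoi() and its relatives is that a return value of zero can mean either "the string was 0" or "the string wasn't a number".  If you need to tell the two apart, the standard C function strtol() reports where it stopped converting.  The sketch below wraps it; parseLongStrict is a hypothetical helper name, and range checking is left out to keep it short.

```cpp
#include <stdlib.h>

// Converts 'text' to a long and reports whether the whole string was numeric.
// parseLongStrict is a hypothetical helper name; strtol() is the standard C function.
bool parseLongStrict(const char* text, long* result)
{
  char* end;
  long value = strtol(text, &end, 10);   // base 10
  if (end == text || *end != '\0') {
    return false;   // no digits at all, or trailing non-numeric characters
  }
  *result = value;
  return true;
}
```

With this, the token "ANG" is rejected while the token "0" is accepted as a real zero.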

Friday, May 6, 2011

Serial Comm Fundamentals on Arduino

First things first, let's lay the foundation for why things have to be done a certain way when writing Serial communication code on the Arduino:
  • This guide covers how to handle human readable serial data.  Other forms of serial data are outside the scope of this guide.
  • Not all code samples posted will be fully functional by themselves. They are only meant as iterative examples to illustrate specific aspects of serial communications leading up to the final and fully functional routine.
  • This guide covers writing code to read serial data coming in on the serial port.  It does not cover how to physically connect to external devices.

Serial speeds vs Arduino speeds

How fast is serial data transmitted?
Serial communications @ 115200 baud (baud equals bits per second)
10 bits per character (115200,8,N,1) = 11,520 characters per second.
This equals 1 character every 86.8 microseconds.

How fast does the Arduino run?
Arduino runs at 16 MHz (MHz = million Hz, or million cycles per second)
This is one clock cycle every 62.5 nanoseconds (1 microsecond = 1000 nanoseconds).  Most AVR instructions execute in one or two cycles.

So, in the time it takes to transmit a single character over the serial line at 115200 baud, the Arduino running at 16 MHz will run through ~1388 clock cycles.  At 57600 baud, the Arduino will run through 2777 cycles.  At 9600 baud, the Arduino will run through over 16,000 cycles in the time it takes to transmit one character.
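The arithmetic behind those figures can be written as a one line function.  This is only a worked example: cyclesPerCharacter is a hypothetical name, and it assumes 10 bits per character (1 start bit, 8 data bits, 1 stop bit) as in the 8,N,1 setting above.

```cpp
// Clock cycles that elapse while one character arrives, assuming 10 bits per
// character (1 start bit, 8 data bits, 1 stop bit).  Integer division truncates.
// cyclesPerCharacter is a hypothetical helper name.
unsigned long cyclesPerCharacter(unsigned long cpuHz, unsigned long baud)
{
  const unsigned long bitsPerCharacter = 10;
  return (cpuHz * bitsPerCharacter) / baud;
}
```

For a 16 MHz clock this gives 1388 at 115200 baud, 2777 at 57600 baud, and 16666 at 9600 baud.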

What this means is that proper Serial processing on the Arduino requires some form of synchronization with the incoming data so you know when you have all the data to be processed.  There are a variety of ways to accomplish this.  The method this tutorial is going to cover uses what's called delimiting characters.  These characters will be arbitrarily chosen based on the data we're transmitting and how we need to handle it.

ASCII
American Standard Code for Information Interchange
ASCII Table

ASCII is the format used to transmit human readable data over the serial line.  Each byte of data represents a 'character' in the ASCII table.  A numeric value in ASCII form is not the same as its value as, say, an int.  The character '1' and the numeric value 1 are not the same.  If you look up the character '1' in an ASCII chart, you will find that its decimal value is 49.

int i = '1';
Serial.println(i);
i = 1;
Serial.println(i);

Serial output will be:
49
1
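Because the digit characters '0' through '9' occupy consecutive codes in the ASCII table, a single digit character can be turned into its numeric value by subtracting '0'.  A minimal sketch (digitValue is a hypothetical helper name):

```cpp
// Returns the numeric value (0-9) of an ASCII digit character, or -1 if the
// character is not a digit.  digitValue is a hypothetical helper name.
int digitValue(char c)
{
  if (c < '0' || c > '9') {
    return -1;
  }
  return c - '0';   // '0' is 48, so '1' (49) - '0' (48) == 1
}
```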


  Characters can be divided into two general categories, printable and non-printable characters.  Non-printable characters can also be referred to as control characters.  Carriage Return is an example of a non-printable character.  It is ASCII code 13 in decimal, but it has no 'printable' representation, though it is often referred to as CR.

One thing that is important to know is how to specify these non-printable 'control' characters within your Arduino code.  There are several ways of doing this.  You can use standard escape sequences in your character literals and strings, or you can specify the character's numeric value in decimal or hex form:

Each of the following three lines creates a char variable named c and assigns it the Carriage Return value (they are alternatives, so use only one).
char c = '\r'; //Use the backslash 'escape character' followed by the letter that stands for Carriage Return
char c = 13; //Use the decimal value for Carriage Return
char c = 0xD; //Use the hexadecimal value for Carriage Return


Delimiters
What is a delimiter?  A delimiter is one or more characters used to specify the boundary between chunks of data in a block of data.
For our purposes, we are looking for three delimiters here. 
  • We are looking for a header or start character, that uniquely identifies the start of a data string. 
  • We are looking for a terminating or stop character that uniquely identifies the end of a data string.
  • We are looking for a field or data delimiter that uniquely separates each discrete piece of data in the string.  This delimiter is not involved with reading in the data string itself, but is necessary when it comes to parsing out the individual data values after the entire string has been read in.


These delimiters will allow our Arduino to know exactly when a data string begins and ends, and how to separate and parse the individual chunks of data within the string.


When choosing our delimiting characters, we have to look at the data we are transmitting and select characters that can be uniquely identified from the data characters 100% of the time.  Printable characters have the advantage of being more easily human readable, but in some instances may not be suitable.  For example, if the data we are receiving can contain any human readable characters, then we can't reliably use human readable characters for delimiting.  We would have to resort to using some non-printable characters.  Typically though, the data you are reading into the Arduino will come from some external sensor that transmits that data in a specific format that you will be unable to change.  So let's look at a couple real world examples.  The first will be a GPS module and the second will be an IMU (two external serial devices I frequently see questions on).

The GPS module will be this one: EM-406A SiRF 3 GPS
The IMU will be this one: Sparkfun Razor 9DOF IMU

Now, I am well aware of the TinyGPS library that is available for interfacing with any NMEA compliant GPS module.  I am only using the GPS module as an example of how to look at serial data coming from a sensor and determine what characters to use for delimiting that data.

Some example data strings from the EM-406A (taken out of its User Manual) are:
$GPGGA,161229.487,3723.2475,N,12158.3416,W,1,07,1.0,9.0,M,,,,0000*18
$GPGLL,3723.2475,N,12158.3416,W,161229.487,A*2C
$GPGSA,A,3,07,02,26,27,09,04,15,,,,,,1.8,1.0,1.5*33

Example data strings from the Razor IMU:
!ANG:320,33,191
!ANG:0,320,90
!ANG:0,0,0

For the GPS module, it is clear from the manual that each string starts with a $ character.  It also appears that the $ character is never used anywhere else in the string.  This is actually part of the NMEA standard for their protocol headers and is intended to be used as the start character for parsing a GPS data string.

Each EM-406A output string is also terminated with a carriage return and line feed.  Line feed is another non-printable character, ASCII code 10, that is typically used in conjunction with carriage return as a line terminator.  Since neither the carriage return nor the line feed is used anywhere else in the data string, we can use either for our terminating character (and with properly written code, it won't matter which we use).

It also becomes clear that each piece of data is separated by a comma.  This will be used as our data delimiting character (and is also a pretty standard character to use for this purpose).

For the Razor IMU, the apparent candidate for the start character is the !.  An alternative start character would be the colon character, since the !ANG portion is static, i.e. never changes.  Either one would suffice and have little impact on the code itself.

The Razor strings are also terminated with CR/LF, so we'll use CR for our terminating character as well.

And yet again, the Razor separates the three angles with commas as well.

So those would be our delimiting characters for reading in the data from those two devices.  For the GPS we'd use $, CR, and comma.  For the Razor we'd use ! (or colon), CR, and comma.

A note on strings
The Arduino provides two methods of storing character data.  You have the standard C style character arrays, and you have Arduino's own attempt at a C++ String class.  C style character arrays require a bit more effort on the programmer's part, while Arduino's String class attempts to provide an easier to use object that handles most of the string manipulation under the hood.  There is one potentially significant pitfall with the String class that you need to be aware of when using it.  The class relies on dynamically allocating buffers to handle different string sizes as well as changes in string sizes.  With the Arduino's limited SRAM, repeated dynamic memory allocation can fragment memory and eventually cause the Arduino to behave unpredictably or lock up completely.  It is for this reason that I utilize C style character arrays.  I'm not saying the String class can't be used, but steps need to be taken to keep it from fragmenting memory, at which point you lose much of the advantage the class tries to provide.

Reading in Serial data
Now that we have our delimiters specified, let's have a brief overview of how we're going to read in our full data string.  We'll start this with an overview of the Serial class methods (methods are functions associated with a class) we'll be using to read in the data string.

Serial.available():     This method returns the number of characters that are available in the Serial buffer at the moment it is called.

Serial.read():      This method returns a single character from the Serial buffer, removing it from the buffer as well.  If there is no data available in the buffer, it returns -1.

And that's it.  These are the only two Serial methods we'll need to read in our data string.  So how do we read in that data string?  There are a couple of hurdles we need to overcome to accomplish this.

Recall from the beginning of this guide that the Arduino will run thousands of cycles in the time it takes to receive a single character.  That is our first hurdle.  The goal of this guide is also to provide a method that is robust, flexible, and efficient.  This precludes any use of delay().  As a general rule, any Serial code that uses delay() is not robust, flexible, or efficient.  Just don't do it.

So then, what is a robust, flexible, and efficient method of reading in Serial data?  A method that regularly checks to see if serial data is available, reads the data that's available, and can reliably determine when it has a complete string of data to process.  So let's start turning that into code:

if(Serial.available()>0){
    char incomingbyte = Serial.read();
    buffer[index++] = incomingbyte;
}

The above code is by no means complete, but it's a start.  It checks to see if there are any characters available from the serial port, and if there are, it reads one in and puts it in a char buffer.  It'll only read a single character though, even if more than one is available.  If this code is executed frequently (and it would be if located inside the Arduino's loop() function), it will still read in serial data as it becomes available (keep in mind the Arduino will run thousands of cycles between each serial character).  Let's modify it slightly to read in all available characters anyway:

while(Serial.available()>0){
    char incomingbyte = Serial.read();
    buffer[index++] = incomingbyte;
}

The above code basically turns our previous code into:  While serial data is available, read it into our buffer.  However, we don't want to put just any data into our buffer.  We want to put a complete string of data into our buffer, and our strings all begin with a start/header character.  So we need to check our incoming data and only put it into our buffer when we see the start character.  Something like this:


char startChar = '$'; // or '!', or whatever your start character is
boolean storeString = false; //This will be our flag to put the data in our buffer
while(Serial.available()>0){
    char incomingbyte = Serial.read();
    if(incomingbyte==startChar){
        index = 0;  //Initialize our index variable
        storeString = true;
    }
    if(storeString){
        buffer[index++] = incomingbyte;
    }
}

This code utilizes a boolean variable as a flag to indicate what to do with incomingbyte.  The first thing we do with incomingbyte is check to see if it's our startChar.  If it is, we set storeString true and set our index to 0.  When storeString is true, we store incoming data into our buffer.  If it's false, we do nothing with the character.  Now we need to add code to determine when we've reached the end of our string, which requires looking for our terminating character.  It's at that point that we now have a complete data string and can then parse it and do whatever is necessary with the parsed data.


char startChar = '$'; // or '!', or whatever your start character is
char endChar = '\r';
boolean storeString = false; //This will be our flag to put the data in our buffer
while(Serial.available()>0){
    char incomingbyte = Serial.read();
    if(incomingbyte==startChar){
        index = 0;  //Initialize our index variable
        storeString = true;
    }
    if(storeString){
        if(incomingbyte==endChar){
            buffer[index] = 0; //null terminate the C string
            //Our data string is complete.  Parse it here
            storeString = false;
        }
        else{
            buffer[index++] = incomingbyte;
        }
    }
}

A second check has now been added for the endChar before storing it in our buffer.  When this second check is true, we now have a complete data string in our buffer.  This string can now be parsed to extract the specific pieces of data we want to use, but there are some additional improvements that we can make to our Serial code before we move on to parsing our data string.

Making our code more robust and flexible

One variable that hasn't been explicitly declared in our sample code so far is our buffer array.  This is an important detail that needs to be covered, but like our delimiter characters, there is no one single answer for all solutions.  Obviously our buffer size has to be large enough to contain our data string (plus one character for a null terminator).  So the thing that needs to be determined is how large can/will our data string be?  If we look back at our GPS data strings, we see a lot of variation (and a lot of repeat commas with no data).  Initially it may look as if determining our buffer size could be a bit of a challenge for these strings, but also recall that I mentioned these strings are compliant with an NMEA controlled specification.  As part of that spec, no NMEA compliant data string can be longer than 80 characters.  So this means our buffer does not need to be larger than 81 characters.

If we look at our Razor data strings, we see that the largest is 15 characters long.  It is important to look at the actual values though, and how large they can be.  In the case of the Razor, it is returning 3 angle values, each ranging from 0-360°.  So the maximum character length of each value is 3 characters.  Notice in our longest example one of the values is only two characters long.  So that means our maximum Razor string size is actually 16 characters.  Allowing one more character for the terminating delimiter and one for the null terminator, our buffer size should be no smaller than 18 characters.

So let's add in our buffer declaration:

//buffer size for NMEA compliant GPS string
//For Razor, set value to 17 instead
#define DATABUFFERSIZE      80
char dataBuffer[DATABUFFERSIZE+1]; //Add 1 for NULL terminator
byte dataBufferIndex = 0;

Here we use a define to specify the size of our buffer, and then use that define to declare a buffer of the appropriate size.  You may also notice I've changed the name of the buffer.  Our previous examples were using a rather ambiguous name (though the new ones are only a bit less ambiguous).  I've also included the declaration of our index variable, but again with a more descriptive name.  The define serves two purposes here.  First, it'll improve the flexibility of our code, since we can easily modify the size of our data buffer from one location.  Second, while at this moment it may seem that we can accomplish the same by modifying the size of the declared array directly, we will be adding more code that relies on the size of the buffer, and that code will utilize this define as well.  So, let us do that now...

When it comes to serial communications, it is not always safe to assume perfect communications.  In fact, it's rarely safe to do so.  Especially if you are dealing with some form of wireless intermediary.  So what does this mean for our Serial code?  What would end up happening if we happened to drop a block of data containing our terminating and start delimiters?  The code as it stands would just continue to read in the next data string, and continue to stash it in our buffer.  The problem is, our buffer is only large enough to store a single data string.  Without any code to protect against this scenario though, we will continue writing to memory beyond our buffer, and bad things are likely to happen once we start doing that.  So let's put a check in to make sure we never write outside our buffer.

while(Serial.available()>0){
    char incomingbyte = Serial.read();
    if(incomingbyte==startChar){
        dataBufferIndex = 0;  //Initialize our dataBufferIndex variable
        storeString = true;
    }
    if(storeString){
        //Let's check our index here, and abort if we're outside our buffer size
        //We use our define here so our buffer size can be easily modified
        if(dataBufferIndex==DATABUFFERSIZE){
            //Oops, our index is pointing to an array element outside our buffer.
            dataBufferIndex = 0;
            break;
        }
        if(incomingbyte==endChar){
            //Our data string is complete.  Parse it here
            storeString = false;
        }
        else{
            dataBuffer[dataBufferIndex++] = incomingbyte;
            dataBuffer[dataBufferIndex] = 0; //null terminate the C string
        }
    }
}

So now we compare our index value to our buffer size.  C style arrays are zero based arrays, meaning the first index in the array is index 0, which means the last index in the array is index (array size - 1).  Our array has DATABUFFERSIZE + 1 elements, so index DATABUFFERSIZE is the last element, the one we reserve for the null terminator.  Thus, if our index equals DATABUFFERSIZE, there is no room left to store another character.  When this happens, we reset our index, and break out of the while loop.  This means we are basically throwing away what we've already read in, but this data is now of unknown integrity.  The reality is this code should almost never be executed, but it's best to have it there to avoid a potential infrequent lockup problem in the future.  There are other options for handling a buffer overflow condition than this one, but this is simple and effective.

The next improvement we can make to this code is to wrap it up into a function call.  This will improve the flexibility and reusability of the code.  To do this, we will have the function return a boolean value that indicates whether a complete string is ready to parse or not.

boolean getSerialString(){
    static byte dataBufferIndex = 0;
    static boolean storeString = false; //Flag to put incoming data in our buffer
    while(Serial.available()>0){
        char incomingbyte = Serial.read();
        if(incomingbyte==startChar){
            dataBufferIndex = 0;  //Initialize our dataBufferIndex variable
            storeString = true;
        }
        if(storeString){
            //Let's check our index here, and abort if we're outside our buffer size
            //We use our define here so our buffer size can be easily modified
            if(dataBufferIndex==DATABUFFERSIZE){
                //Oops, our buffer is full and we still haven't seen the end character.
                dataBufferIndex = 0;
                storeString = false; //Ignore the rest until the next start character
                break;
            }
            if(incomingbyte==endChar){
                dataBuffer[dataBufferIndex] = 0; //null terminate the C string
                storeString = false; //Wait for the next start character
                //Our data string is complete.  return true
                return true;
            }
            else{
                dataBuffer[dataBufferIndex++] = incomingbyte;
                dataBuffer[dataBufferIndex] = 0; //null terminate the C string
            }
        }
    }
   
    //We've read in all the available Serial data, and don't have a valid string yet, so return false
    return false;
}

No significant modifications here.  The code just goes into a function called getSerialString() that returns a boolean value.  Since dataBufferIndex and storeString are only used by our Serial read code in this function, I've also moved their declarations into the function.  Declaring them static means the values will be retained between separate calls to getSerialString().  storeString is also cleared when a string completes or the buffer overflows, so any stray characters (such as the trailing line feed) are ignored until the next start character arrives.  We have only two points that we return from, one returning true (string is available), and the other returning false (a complete string is not available yet).  The usage of this function is simple:

if(getSerialString()){
    //String available for parsing.  Parse it here
}

That code would go someplace where it gets called at regular intervals, for example right at the start of your loop().  And there you have your serial input routine.  All you need to do is specify a start and terminating delimiter, and the maximum data string size to read in virtually any ASCII based data strings.
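Since the logic only depends on the characters it is given, the same state machine can also be written so that the caller feeds it one character at a time, which makes it possible to try out without any hardware attached.  The following is a sketch along those lines, not a drop-in replacement: FrameReader, feed(), and bufferIndex are hypothetical names, and on overflow it drops the partial string and waits for the next start character, which is only one of several possible policies.

```cpp
#include <stddef.h>

#define DATABUFFERSIZE 80

// Same start/end delimiter logic as the Serial routine, but fed one character
// at a time by the caller instead of reading from Serial.
// FrameReader and feed() are hypothetical names used for illustration.
struct FrameReader {
  char startChar;
  char endChar;
  char dataBuffer[DATABUFFERSIZE + 1];  // +1 for the null terminator
  size_t bufferIndex;
  bool storeString;

  FrameReader(char start, char end)
    : startChar(start), endChar(end), bufferIndex(0), storeString(false) {
    dataBuffer[0] = 0;
  }

  // Returns true when 'c' completes a string; dataBuffer then holds it.
  bool feed(char c) {
    if (c == startChar) {
      bufferIndex = 0;
      storeString = true;
    }
    if (!storeString) {
      return false;                       // still waiting for a start character
    }
    if (bufferIndex == DATABUFFERSIZE) {  // buffer full: discard and resynchronize
      bufferIndex = 0;
      storeString = false;
      return false;
    }
    if (c == endChar) {
      dataBuffer[bufferIndex] = 0;        // null terminate the C string
      storeString = false;
      return true;
    }
    dataBuffer[bufferIndex++] = c;
    dataBuffer[bufferIndex] = 0;          // keep the buffer terminated as we go
    return false;
  }
};
```

On the Arduino you would call feed(Serial.read()) for each available character.  On a desktop you can loop over a test string that contains a partial line, a complete line, and CR/LF pairs, and check that only the complete line is reported.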