Introduction
The Equation parser is a program written in C# that parses mathematical formulas such as “12^3+log(3.45)”. It can also solve more complex formulas such as “-(5 - 10)^(-1) ( 3 + 2(cos( 3 Pi )+( 2+ ln( exp(1) ) ) ^3))”. It reads the whole formula and returns the final result. I used the .NET Regex class (regular expressions) to solve the formulas; refer to section 4 to read more about regular expressions. I needed this kind of formula parser while writing a program for a school. The program had to take a formula and substitute different numbers into it to get different results, so that two separate exams on the same subject could ask the same question with different numbers. This helped reduce cheating in the exams. The main interface of the program is shown in Fig. 1.
The parser algorithm
The program contains a class called “Parser”. It is the main class of the program and does the parsing work. The parser evaluates a formula in the following steps:
- It removes the blank spaces. Blank spaces would prevent the parser from calculating correctly, so they are removed from the original expression. For example, the formula “-(5 - 10)^(-1) ( 3 + 2( cos( 3 Pi )+( 2+ ln( exp(1) ) ) ^3))” becomes “-(5-10)^(-1)(3+2(cos(3Pi)+(2+ln(exp(1)))^3))”
Expression = Expression.Replace(" ", "");
- Sometimes it is necessary to add the * for multiplication, because an expression such as 5y means 5 multiplied by y. For example, the formula “-(5-10)^(-1)(3+2(cos(3Pi)+(2+ln(exp(1)))^3))” becomes “-(5-10)^(-1)*(3+2*(cos(3*Pi)+(2+ln(exp(1)))^3))”
// The pattern matches the zero-width positions where a * is implied.
// The letter e is left out of the first alternative so that numbers like 1e5 stay intact;
// exp is handled by the last alternative, which must not fire right after an operator or "(".
Regex regEx = new Regex(@"(?<=[\d\)])(?=[a-df-z\(])|(?<=pi)(?=[^\+\-\*\/\^!)])|(?<=\))(?=\d)|(?<=[^\/\*\+\-\(])(?=exp)", RegexOptions.IgnoreCase);
Expression = regEx.Replace(Expression, "*");
- Replace the mathematical constant Pi with its numeric value (3.14...), rounded to rounding_number decimal places
regEx = new Regex("pi", RegexOptions.IgnoreCase);
Expression = regEx.Replace(Expression, Math.Round(Math.PI, rounding_number).ToString());
- Search for the innermost parentheses and solve the expression between them. The expression is cut into segments and each segment is parsed separately
regEx = new Regex(@"([a-z]*)\(([^\(\)]+)\)(\^|!?)", RegexOptions.IgnoreCase);
Match m = regEx.Match(Expression);
while (m.Success)
{
    // Group 1: optional function name, group 2: content of the parentheses,
    // group 3: a trailing ^ or ! operator, if any
    if (m.Groups[3].Value.Length > 0)
        // Wrap the value in braces so that the following ^ or ! applies to the whole value
        Expression = Expression.Replace(m.Value, "{" + m.Groups[1].Value + this.Solve(m.Groups[2].Value) + "}" + m.Groups[3].Value);
    else
        Expression = Expression.Replace(m.Value, m.Groups[1].Value + this.Solve(m.Groups[2].Value));
    m = regEx.Match(Expression);
}
- Repeat step 4 until no parentheses remain, then solve the remaining expression by using the C# Math class.
this.strValue = Convert.ToString(this.Solve(Expression));
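The first three steps can be sketched together as follows. This is only a minimal illustration, not the parser's actual code: the class name Preprocessor, the method Normalize and its digits parameter are hypothetical, the implicit-multiplication pattern is a simplified subset of the one shown above (it does not handle a number directly followed by exp), and the invariant culture is an assumption made so that the output does not depend on the machine's regional settings.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class Preprocessor
{
    // Hypothetical helper: prepares a formula for evaluation (steps 1 to 3).
    public static string Normalize(string expression, int digits)
    {
        // Step 1: remove the blank spaces.
        string s = expression.Replace(" ", "");

        // Step 2: insert '*' where a multiplication is implied:
        // between a digit or ')' and a letter or '(', and between ')' and a digit.
        // The letter 'e' is excluded so that numbers such as 1e5 stay intact.
        // Lookarounds are zero-width, so Replace only inserts text.
        Regex implicitMul = new Regex(@"(?<=[\d)])(?=[a-df-z(])|(?<=\))(?=\d)",
                                      RegexOptions.IgnoreCase);
        s = implicitMul.Replace(s, "*");

        // Step 3: replace the constant pi by its rounded numeric value.
        string pi = Math.Round(Math.PI, digits).ToString(CultureInfo.InvariantCulture);
        return Regex.Replace(s, "pi", pi, RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        // Prints 2*(cos(3*3.14)+1)*4
        Console.WriteLine(Normalize("2 ( cos( 3 Pi ) + 1 ) 4", 2));
    }
}
```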
Step 5 deserves more detail. The function Solve(Expression) is the main function of the Parser class and works through the following sub-steps:
- 5.1 We search for expressions containing cos, sin, tan, etc.
Regex regEx = new Regex(@"([a-z]{2,})([+-]?\d+\.?,*\d*[eE][+-]*\d+\.?|[+-]?\d+\.?,*\d*)", RegexOptions.IgnoreCase);
Match m = regEx.Match(Expression);
while (m.Success && this.FunctionList.IndexOf(m.Groups[1].Value.ToLower()) > -1)
{
    switch (m.Groups[1].Value.ToLower())
    {
        case "abs":
            Expression = Expression.Replace(m.Groups[0].Value, Math.Abs(Convert.ToDouble(m.Groups[2].Value)).ToString());
            m = regEx.Match(Expression);
            continue;
        case "acos":
            Expression = Expression.Replace(m.Groups[0].Value, Math.Round(Math.Acos(this.Factor * Convert.ToDouble(m.Groups[2].Value)), rounding_number).ToString());
            m = regEx.Match(Expression);
            continue;
        case "asin":
            Expression = Expression.Replace(m.Groups[0].Value, Math.Round(Math.Asin(this.Factor * Convert.ToDouble(m.Groups[2].Value)), rounding_number).ToString());
            m = regEx.Match(Expression);
            continue;
        case "atan":
            Expression = Expression.Replace(m.Groups[0].Value, Math.Round(Math.Atan(this.Factor * Convert.ToDouble(m.Groups[2].Value)), rounding_number).ToString());
            m = regEx.Match(Expression);
            continue;
        case "cos":
            Expression = Expression.Replace(m.Groups[0].Value, Math.Round(Math.Cos(this.Factor * Convert.ToDouble(m.Groups[2].Value)), rounding_number).ToString());
            m = regEx.Match(Expression);
            continue;
        case "ceil":
            Expression = Expression.Replace(m.Groups[0].Value, Math.Ceiling(Convert.ToDouble(m.Groups[2].Value)).ToString());
            m = regEx.Match(Expression);
            continue;
        case "cosh":
            Expression = Expression.Replace(m.Groups[0].Value, Math.Round(Math.Cosh(this.Factor * Convert.ToDouble(m.Groups[2].Value)), rounding_number).ToString());
            m = regEx.Match(Expression);
            continue;
        case "exp":
            Expression = Expression.Replace(m.Groups[0].Value, Math.Round(Math.Exp(Convert.ToDouble(m.Groups[2].Value)), rounding_number).ToString());
            m = regEx.Match(Expression);
            continue;
        case "floor":
            Expression = Expression.Replace(m.Groups[0].Value, Math.Floor(Convert.ToDouble(m.Groups[2].Value)).ToString());
            m = regEx.Match(Expression);
            continue;
        case "ln":
            Expression = Expression.Replace(m.Groups[0].Value, Math.Round(Math.Log(Convert.ToDouble(m.Groups[2].Value), Math.Exp(1.0)), rounding_number).ToString());
            m = regEx.Match(Expression);
            continue;
        case "log":
            Expression = Expression.Replace(m.Groups[0].Value, Math.Round(Math.Log10(Convert.ToDouble(m.Groups[2].Value)), rounding_number).ToString());
            m = regEx.Match(Expression);
            continue;
        case "sign":
            Expression = Expression.Replace(m.Groups[0].Value, Math.Sign(Convert.ToDouble(m.Groups[2].Value)).ToString());
            m = regEx.Match(Expression);
            continue;
        case "sin":
            Expression = Expression.Replace(m.Groups[0].Value, Math.Round(Math.Sin(this.Factor * Convert.ToDouble(m.Groups[2].Value)), rounding_number).ToString());
            m = regEx.Match(Expression);
            continue;
        case "sinh":
            Expression = Expression.Replace(m.Groups[0].Value, Math.Round(Math.Sinh(this.Factor * Convert.ToDouble(m.Groups[2].Value)), rounding_number).ToString());
            m = regEx.Match(Expression);
            continue;
        case "sqrt":
            Expression = Expression.Replace(m.Groups[0].Value, Math.Round(Math.Sqrt(Convert.ToDouble(m.Groups[2].Value)), rounding_number).ToString());
            m = regEx.Match(Expression);
            continue;
        case "tan":
            Expression = Expression.Replace(m.Groups[0].Value, Math.Round(Math.Tan(this.Factor * Convert.ToDouble(m.Groups[2].Value)), rounding_number).ToString());
            m = regEx.Match(Expression);
            continue;
        case "tanh":
            Expression = Expression.Replace(m.Groups[0].Value, Math.Round(Math.Tanh(this.Factor * Convert.ToDouble(m.Groups[2].Value)), rounding_number).ToString());
            m = regEx.Match(Expression);
            continue;
    }
}
- 5.2 Solve for factorial
regEx = new Regex(@"\{(.+)\}!"); // Search for patterns like {5}!
m = regEx.Match(Expression);
while (m.Success)
{
    double n = Convert.ToDouble(m.Groups[1].Value);
    if ((n < 0) || (n != Math.Round(n)))
        throw new Exception(); // Value negative or not integer -> throw exception
    Expression = regEx.Replace(Expression, this.Fact(Convert.ToDouble(m.Groups[1].Value)).ToString(), 1);
    m = regEx.Match(Expression);
}
regEx = new Regex(@"(\d+,*\d*[eE][+-]?\d+|\d+,*\d*)!"); // Search for patterns like 5!
m = regEx.Match(Expression);
while (m.Success)
{
    double n = Convert.ToDouble(m.Groups[1].Value);
    if ((n < 0) || (n != Math.Round(n)))
        throw new Exception(); // Value negative or not integer -> throw exception
    Expression = regEx.Replace(Expression, this.Fact(Convert.ToDouble(m.Groups[1].Value)).ToString(), 1);
    m = regEx.Match(Expression);
}
- 5.3 Solve for division and multiplication
regEx = new Regex(@"([+-]?\d+\.?,*\d*[eE][+-]?\d+\.?|[-+]?\d+\.?,*\d*)([/*])(-?\d+\.?,*\d*[eE][+-]?\d+\.?|-?\d+\.?,*\d*)");
m = regEx.Match(Expression, 0);
while (m.Success)
{
    double result;
    switch (m.Groups[2].Value)
    {
        case "*":
            result = Math.Round(Convert.ToDouble(m.Groups[1].Value) * Convert.ToDouble(m.Groups[3].Value), rounding_number);
            // The match swallows the sign of the first operand, so a non-negative
            // result in the middle of the expression needs an explicit "+"
            if ((result < 0) || (m.Index == 0))
                Expression = regEx.Replace(Expression, result.ToString(), 1);
            else
                Expression = regEx.Replace(Expression, "+" + result, 1); // replace only this match, not every identical substring
            m = regEx.Match(Expression);
            continue;
        case "/":
            result = Math.Round(Convert.ToDouble(m.Groups[1].Value) / Convert.ToDouble(m.Groups[3].Value), rounding_number);
            if ((result < 0) || (m.Index == 0))
                Expression = regEx.Replace(Expression, result.ToString(), 1);
            else
                Expression = regEx.Replace(Expression, "+" + result, 1);
            m = regEx.Match(Expression);
            continue;
    }
}
- 5.4 Solve for addition and subtraction
regEx = new Regex(@"([+-]?\d+\.?,*\d*[eE][+-]?\d+\.?|[+-]?\d+\.?,*\d*)([+-])(-?\d+,*\d*[eE][+-]?\d+\.?|-?\d+\.?,*\d*)");
m = regEx.Match(Expression, 0);
while (m.Success)
{
    double result;
    switch (m.Groups[2].Value)
    {
        case "+":
            result = Math.Round(Convert.ToDouble(m.Groups[1].Value) + Convert.ToDouble(m.Groups[3].Value), rounding_number);
            if ((result < 0) || (m.Index == 0))
                Expression = regEx.Replace(Expression, result.ToString(), 1);
            else
                Expression = regEx.Replace(Expression, "+" + result, 1);
            m = regEx.Match(Expression);
            continue;
        case "-":
            result = Math.Round(Convert.ToDouble(m.Groups[1].Value) - Convert.ToDouble(m.Groups[3].Value), rounding_number);
            if ((result < 0) || (m.Index == 0))
                Expression = regEx.Replace(Expression, result.ToString(), 1);
            else
                Expression = regEx.Replace(Expression, "+" + result, 1);
            m = regEx.Match(Expression);
            continue;
    }
}
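The reduction idea behind the multiplication/division and addition/subtraction passes can be tried in isolation with the sketch below. It is a simplified illustration under stated assumptions, not the Parser class itself: FlatEvaluator, Reduce and Evaluate are hypothetical names, numbers are limited to plain decimals with a dot (no exponents, no comma separators), parsing uses the invariant culture, results are formatted with at most ten decimals, and parentheses, powers, factorials, functions and division by zero are not handled.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class FlatEvaluator
{
    // Assumed number format: digits with an optional decimal part.
    private const string Num = @"\d+(?:\.\d+)?";

    // First operand may carry a sign; second operand may be negative.
    private static readonly Regex MulDiv =
        new Regex("([+-]?" + Num + ")([*/])(-?" + Num + ")");
    private static readonly Regex AddSub =
        new Regex("([+-]?" + Num + ")([+-])(-?" + Num + ")");

    // Repeatedly replaces the leftmost "a op b" match by its value.
    private static string Reduce(string expr, Regex rx)
    {
        Match m = rx.Match(expr);
        while (m.Success)
        {
            double a = double.Parse(m.Groups[1].Value, CultureInfo.InvariantCulture);
            double b = double.Parse(m.Groups[3].Value, CultureInfo.InvariantCulture);
            double r;
            switch (m.Groups[2].Value)
            {
                case "*": r = a * b; break;
                case "/": r = a / b; break;
                case "+": r = a + b; break;
                default:  r = a - b; break;
            }
            string text = r.ToString("0.##########", CultureInfo.InvariantCulture);
            // The match consumed the sign in front of the first operand, so a
            // non-negative result inside the expression needs an explicit '+'.
            if (r >= 0 && m.Index > 0) text = "+" + text;
            expr = expr.Substring(0, m.Index) + text + expr.Substring(m.Index + m.Length);
            m = rx.Match(expr);
        }
        return expr;
    }

    // Evaluates a blank-free expression without parentheses.
    public static double Evaluate(string expr)
    {
        expr = Reduce(expr, MulDiv); // higher precedence first
        expr = Reduce(expr, AddSub);
        return double.Parse(expr, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        // 2 + 12 - 2.5 = 11.5
        Console.WriteLine(Evaluate("2+3*4-10/4").ToString(CultureInfo.InvariantCulture));
    }
}
```

Running the multiplication/division pass to completion before the addition/subtraction pass is what gives the operators their usual precedence; within one pass, the leftmost match is reduced first, which gives left-to-right evaluation.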
Program capabilities
The program can solve expressions containing the following operators
- ^ : for the power
- *,/: multiplication and division
- -,+: subtraction and addition
- !: factorial of a non-negative integer
In addition, the program can evaluate mathematical functions such as cos, sin, log, etc. Table 1 lists the mathematical functions that the program supports.
Table 1: The list of mathematical functions provided by our parser
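The parser calls a Fact function whose body is not shown in this article. A possible implementation, given purely as an assumption, could look like the sketch below; the class name FactorialSketch and the choice to return infinity above 170 are hypothetical.

```csharp
using System;

public static class FactorialSketch
{
    // Hypothetical stand-in for the parser's Fact method.
    public static double Fact(double n)
    {
        // The factorial is only defined here for non-negative integers.
        if (n < 0 || n != Math.Floor(n))
            throw new ArgumentException("Factorial needs a non-negative integer.", "n");

        // 171! no longer fits into a double.
        if (n > 170)
            return double.PositiveInfinity;

        double result = 1.0;
        for (int i = 2; i <= (int)n; i++)
            result *= i;
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(Fact(5)); // 120
        Console.WriteLine(Fact(0)); // 1
    }
}
```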
Function | Description | Examples |
sin | Sine: sin(v), where v is the value of angle in radians. | sin(3.14), sin(x) |
cos | Cosine: cos(v), where v is the value of angle in radians. | cos(3.14), cos(x) |
tan | Tangent: tan(v), where v is the value of angle in radians. | tan(3.14), tan(x) |
sinh | Hyperbolic Sine: sinh(v), where v is the value of hyperbolic angle in radians. | sinh(3.14), sinh(x) |
cosh | Hyperbolic Cosine: cosh(v), where v is the value of hyperbolic angle in radians. | cosh(3.14), cosh(x) |
tanh | Hyperbolic Tangent: tanh(v), where v is the value of hyperbolic angle in radians. | tanh(3.14), tanh(x) |
asin | Arcsine: asin(v), where v is a value in [-1,+1] range. | asin(-0.5), asin(x) |
acos | Arccosine: acos(v), where v is a value in [-1,+1] range. | acos(-0.5), acos(x) |
atan | Arctangent: atan(v), where v is any real value. | atan(-0.5), atan(x) |
log | log(v) returns the logarithm base 10 of value v. | log(125.2), log(x) |
ln | ln(v) returns the logarithm base e of value v. | ln(125.2), ln(x) |
exp | exp(v) returns the value of e raised to power of v. | exp(4.5), exp(x) |
abs | abs(v) returns the absolute value of v. | abs(-4.5), abs(x) |
ceil | ceil(v) returns the rounded up integer of decimal value v. | ceil(5.01) -> returns 6.0 |
floor | floor(v) returns the rounded down integer of decimal value v. | floor(5.99) -> returns 5.0 |
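The functions in Table 1 map directly onto methods of the .NET Math class. The parser itself dispatches with a switch statement; the sketch below shows an alternative, a lookup table, only to make the mapping explicit. The names FunctionTable and Apply are hypothetical, angles are assumed to be in radians, and no rounding is applied.

```csharp
using System;
using System.Collections.Generic;

public static class FunctionTable
{
    // Function name -> implementation; names are matched case-insensitively.
    private static readonly Dictionary<string, Func<double, double>> Functions =
        new Dictionary<string, Func<double, double>>(StringComparer.OrdinalIgnoreCase)
        {
            { "sin",   v => Math.Sin(v) },
            { "cos",   v => Math.Cos(v) },
            { "tan",   v => Math.Tan(v) },
            { "sinh",  v => Math.Sinh(v) },
            { "cosh",  v => Math.Cosh(v) },
            { "tanh",  v => Math.Tanh(v) },
            { "asin",  v => Math.Asin(v) },
            { "acos",  v => Math.Acos(v) },
            { "atan",  v => Math.Atan(v) },
            { "log",   v => Math.Log10(v) },  // base 10
            { "ln",    v => Math.Log(v) },    // base e
            { "exp",   v => Math.Exp(v) },
            { "abs",   v => Math.Abs(v) },
            { "ceil",  v => Math.Ceiling(v) },
            { "floor", v => Math.Floor(v) }
        };

    // Applies the named function to v, or throws for an unknown name.
    public static double Apply(string name, double v)
    {
        Func<double, double> f;
        if (!Functions.TryGetValue(name, out f))
            throw new ArgumentException("Unknown function: " + name);
        return f(v);
    }

    public static void Main()
    {
        Console.WriteLine(Apply("ceil", 5.01));  // 6
        Console.WriteLine(Apply("floor", 5.99)); // 5
    }
}
```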
Regular Expressions
Regular expressions are a pattern-matching standard for string parsing and replacement. They are used on a wide range of platforms and programming environments. Originally missing in Visual Basic, regular expressions are now available for most VB and VBA versions. Regular expressions, or regexes for short, are a way to match text with patterns. They are a powerful way to find and replace strings that take a defined format. For example, regular expressions can be used to parse dates, URLs and email addresses, log files, configuration files, command line switches or programming scripts. Since regexes are language independent, this section is kept as language independent as possible. Note, however, that not all regex implementations are the same. The text below is based on Perl 5.0, which is also the format that RegExpr for VB/VBA uses. Some implementations may not handle all expressions the same way.
Regex syntax
In its simplest form, a regular expression is a string of symbols to match “as is”.
Regex | Matches |
abc | abcabcabc |
234 | 12345 |
That’s not very impressive yet. But you can see that regexes match the first case found, once, anywhere in the input string.
Quantifiers
So what if you want to match several characters? You need to use a quantifier. The most important quantifiers are *, ? and +. They may look familiar to you from, say, the dir command of DOS, but they’re not exactly the same.
* matches any number of what’s before it, from zero to infinity.
? matches zero or one.
+ matches one or more.
Regex | Matches |
23*4 | 1245, 12345, 123345 |
23?4 | 1245, 12345 |
23+4 | 12345, 123345 |
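By default, regexes are greedy. They take as many characters as possible. In the next example, you can see that the regex matches as many 2’s as there are. (The pattern 2+ is used here because 2* also accepts zero characters and would therefore first match the empty string at the very start of the input.)
Regex | Matches |
2+ | 122223 |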
There is also stingy (lazy) matching available, which matches as few characters as possible, but let’s leave it for now. There are also more quantifiers than those mentioned here.
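The quantifier tables above can be checked with the .NET Regex class. The sketch below is only an illustration; QuantifierDemo and First are hypothetical names, and First simply returns the text of the first match (an empty string when nothing matches).

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // Returns the first match of pattern in input, or "" if there is none.
    public static string First(string pattern, string input)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        Console.WriteLine(First("23*4", "1245"));    // 24   (zero 3's)
        Console.WriteLine(First("23*4", "123345"));  // 2334 (two 3's)
        Console.WriteLine(First("23?4", "12345"));   // 234
        Console.WriteLine(First("23+4", "123345"));  // 2334

        // Greedy versus lazy: + takes as many 2's as possible, +? as few as possible.
        Console.WriteLine(First("2+", "122223"));    // 2222
        Console.WriteLine(First("2+?", "122223"));   // 2
    }
}
```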
Special characters
A lot of special characters are available for regex building. Here are some of the more usual ones.
. | The dot matches any single character. |
\n | Matches a newline character (or CR+LF combination). |
\t | Matches a tab (ASCII 9). |
\d | Matches a digit [0-9]. |
\D | Matches a non-digit. |
\w | Matches an alphanumeric character. |
\W | Matches a non-alphanumeric character. |
\s | Matches a whitespace character. |
\S | Matches a non-whitespace character. |
\ | Use to escape special characters. For example, \. matches a dot, and \\ matches a backslash. |
^ | Match at the beginning of the input string. |
$ | Match at the end of the input string. |
Here are some likely uses for the special characters.
Regex | Matches |
1.3 | 123,1z3,133 |
1.*3 | 13,123,1zdfkj3 |
\d\d | 01, 02, ..., 99 |
\w+@\w+ | a@a, email@company.com |
^ and $ are important to regexes. Without them, regexes match anywhere in the input. With ^ and $ you can make sure to match only a full string, the beginning of the input, or the end of the input.
Regex | Matches | Does not match |
^1.*3$ | 13, 123, 1zdfkj3 | x13, 123x, x1zdfkj3x |
^\d\d | 01abc | a01abc |
\d\d$ | xyz01 | 01xyz |
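The effect of the anchors can be verified with Regex.IsMatch, as in the short sketch below. AnchorDemo and Matches are hypothetical names used only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // True when pattern matches somewhere in input (anchors restrict where).
    public static bool Matches(string pattern, string input)
    {
        return Regex.IsMatch(input, pattern);
    }

    public static void Main()
    {
        // Without anchors the pattern may match anywhere in the input.
        Console.WriteLine(Matches(@"1.*3", "x123x"));    // True
        // With ^ and $ the whole input has to match.
        Console.WriteLine(Matches(@"^1.*3$", "123"));    // True
        Console.WriteLine(Matches(@"^1.*3$", "x123x"));  // False
        // ^ alone anchors the start, $ alone anchors the end.
        Console.WriteLine(Matches(@"^\d\d", "01abc"));   // True
        Console.WriteLine(Matches(@"^\d\d", "a01abc"));  // False
        Console.WriteLine(Matches(@"\d\d$", "xyz01"));   // True
        Console.WriteLine(Matches(@"\d\d$", "01xyz"));   // False
    }
}
```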
Character classes
You can group characters by putting them between square brackets. This way, any character in the class will match one character in the input.
[abc] | Match any of a, b, and c. |
[a-z] | Match any character between a and z. (ASCII order) |
[^abc] | A caret ^ at the beginning indicates “not”. In this case, match anything other than a, b, or c. |
[+*?.] | Most special characters have no meaning inside the square brackets. This expression matches any of +, *, ? or the dot. |
Here are some sample uses.
Regex | Matches | Does not match |
[^ab] | c, d, z | ab |
^[1-9][0-9]*$ | Any positive integer | Zero, negative or decimal numbers |
[0-9]*[,.]?[0-9]+ | .1, 1, 1.2, 100,000 | 12. |
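Character classes are handy for validating input. The sketch below wraps two of the patterns from the table in hypothetical helper methods (CharClassDemo, IsPositiveInteger and IsNumber are names made up for this example). The number pattern is anchored with ^ and $ here, an addition to the table's pattern, so that the whole input has to match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharClassDemo
{
    // A first digit 1-9 followed by any number of digits 0-9.
    public static bool IsPositiveInteger(string text)
    {
        return Regex.IsMatch(text, @"^[1-9][0-9]*$");
    }

    // Optional digits, an optional comma or dot, then at least one digit.
    public static bool IsNumber(string text)
    {
        return Regex.IsMatch(text, @"^[0-9]*[,.]?[0-9]+$");
    }

    public static void Main()
    {
        Console.WriteLine(IsPositiveInteger("42"));   // True
        Console.WriteLine(IsPositiveInteger("0"));    // False
        Console.WriteLine(IsPositiveInteger("-5"));   // False
        Console.WriteLine(IsNumber("100,000"));       // True
        Console.WriteLine(IsNumber(".1"));            // True
        Console.WriteLine(IsNumber("12."));           // False
        // [^ab] needs at least one character other than a or b.
        Console.WriteLine(Regex.IsMatch("ab", "[^ab]"));  // False
        Console.WriteLine(Regex.IsMatch("abc", "[^ab]")); // True
    }
}
```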
Grouping and alternatives
It’s often necessary to group things together with parentheses ( and ).
Regex | Matches | Does not match |
(ab)+ | ab, abab, ababab | aabb |
(aa|bb)+ | aa, bbaa, aabbaaaa | abab |
Notice the | operator. This is the Or operator that takes any of the alternatives. With parentheses, you can also define subexpressions to remember after the match has happened. In the examples below, the text matched between ( and ) is stored.
Regex | Matches | Stores |
a(\d+)a | a12a | 12 |
(\d+)\.(\d+) | 1.2 | 1 and 2 |
In these examples, what is matched by (\d+) gets stored. The regex engine will allow you to retrieve the stored value by a successive call. The implementation of the call varies. In RegExpr for VB/VBA, you call RegExprResult(1) to get the first stored value, RegExprResult(2) to get the second one, and so on. This way you can retrieve fields for further processing.
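In .NET, which the parser uses, the stored values are available through the Groups collection of a Match, where index 0 is the whole match and indexes 1, 2, ... are the parenthesized subexpressions. The sketch below illustrates this; GroupDemo and Stored are hypothetical names, and the last pattern is a simplified version of the function-call pattern used by the parser.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Returns the text stored by groups 1..n, or an empty array if nothing matches.
    public static string[] Stored(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        if (!m.Success)
            return new string[0];

        string[] result = new string[m.Groups.Count - 1];
        for (int i = 1; i < m.Groups.Count; i++)
            result[i - 1] = m.Groups[i].Value; // Groups[0] is the whole match
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" and ", Stored(@"a(\d+)a", "a12a")));     // 12
        Console.WriteLine(string.Join(" and ", Stored(@"(\d+)\.(\d+)", "1.2"))); // 1 and 2

        // Function name and argument of a call without nested parentheses.
        Console.WriteLine(string.Join(" and ",
            Stored(@"([a-z]+)\(([^()]+)\)", "cos(3.14)")));                      // cos and 3.14
    }
}
```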
The source code
The source code of the parser is attached. You can use the Parser class by including it in your program. The code was written in C# with Visual Studio 2008.