Massassi Forums Logo

This is the static archive of the Massassi Forums. The forums are closed indefinitely. Thanks for all the memories!

You can also download Super Old Archived Message Boards from when Massassi first started.

"View" counts are as of the day the forums were archived, and will no longer increase.

ForumsDiscussion Forum → MySQL running sum problem
MySQL running sum problem
2007-10-25, 6:47 PM #1
I'm trying to do a running sum. Here's the query I have so far:

Code:
SELECT 
	f.fin_id AS `id`, 
	f.fin_text AS `What`, 
	concat('$', f.fin_amt) AS `Amount`, 
	DATE_FORMAT(f.fin_occur, '%m/%d/%Y') AS `Date`,
	f.fin_added,
	SUM(f2.fin_amt) as `Balance`
FROM finances f,
finances f2
WHERE f.fin_domain = 'CC'
	AND DATE_FORMAT(f.fin_occur, '%Y') = '2007'
	AND DATE_FORMAT(f.fin_occur, '%m') = '10'
	AND f2.fin_domain = 'CC'
	AND f2.fin_occur <= f.fin_occur
GROUP BY f.fin_id
ORDER BY f.fin_occur, f.fin_added


Here's the result set I get back:

[http://www.jrh3k5.net/misc/cc.jpg]

The problem, as may be able to see, is that the running sum is the sum of an entire day, not the sum at the point of an individual transaction.

My problem is that I can't qualify f2.fin_id <= f.fin_id because the data set is ordered by the two date columns, making the ordering of the ID column useless. I can't use f2.fin_added <= f.fin_added because they weren't necessarily added in a linear order (notice the the tuxedo rental was added before the Price Chopper transaction).

Any ideas?

[ Edit: Holy crap, that's the last time I use MS Paint to make a JPEG ]
the idiot is the person who follows the idiot and your not following me your insulting me your following the path of a idiot so that makes you the idiot - LC Tusken
2007-10-25, 10:15 PM #2
I don't get what you're trying to do. It seems like you're missing some important columns. If you have an ordering system, you should have an order_id, which specifies the entire order, for example, the user who ordered, the date it was ordered, when they checked out, etc. Then you should have a separate table for line items, and the line item table should have a foreign key to the order_id of the order table, then it should have a separate line_item_id or something similar. Then, you select from the line_items table, join it to the orders table, and group by ORDER_ID. That way, you will have a SUM that groups everything from a single order.

The data you're showing here is confusing because you have things with completely different ids and text descriptions, but the same SUM, which leads me to believe that you're either missing some information from your schema (bad) or you're not showing all the possible columns we have to choose from.
2007-10-25, 10:16 PM #3
Oh wait, I see what you're trying to do, hold on.
2007-10-25, 10:19 PM #4
I think you need to get rid of this:
AND f2.fin_domain = 'CC'
AND f2.fin_occur <= f.fin_occur

And do:
AND f2.fin_id = f.fin_id
2007-10-25, 10:23 PM #5
Meh, I searched on google for this type of thing, and it's not exactly straightforward. I found two decent links on doing running totals in SQL:

http://www.1keydata.com/sql/sql-running-totals.html
http://www.databasejournal.com/features/mssql/article.php/3112381

I've actually never run into this problem personally, so I don't have any experience with it. Is it just me or are you not showing deposits in your screenshot above? All I see are deductions, but the balance is going up and down like crazy :)
2007-10-26, 3:01 AM #6
Sometimes, it's okay to leave processing up to the client. A running total like this would seem an obvious candidate. The client gets the total balance of all transactions before the data returned, and then it can step through the dataset, adding to the the running total variable and displaying it at each step.

I know there is a good way of doing this in an Oracle SQL query, that actually calculates it as a running total, but doing it like this with a cross join is just horrible. I wouldn't hold my breath for an equivalent feature in MySQL.
2007-10-26, 3:32 AM #7
Having said that, the only reason I can see for your problem is that fin_occur is just a date, without a time component. You'll need to expand your compare so it matches your ORDER. Try something like this:
Code:
	AND (    f2.fin_occur < f.fin_occur
	      OR (     f2.fin_occur = f.fin_occur
	           AND f2.fin_added <= f.fin_added ) )
2007-10-26, 5:13 AM #8
To note, this if for a simple personal finance application I wrote.

Originally posted by Brian:
I think you need to get rid of this:
AND f2.fin_domain = 'CC'
AND f2.fin_occur <= f.fin_occur

And do:
AND f2.fin_id = f.fin_id


The problem with this is that it won't get a sum of all transactions that precede it - it'll only tell me what the sum is of that specific transaction.

Originally posted by Brian:
Is it just me or are you not showing deposits in your screenshot above? All I see are deductions, but the balance is going up and down like crazy :)


It's for a credit card (thus, 'f.fin_domain = 'CC'), so the balance will go up and down - notice the $-901.42. :) Don't worry - I'm not driving myself into complete debt!

Originally posted by Giraffe:
The client gets the total balance of all transactions before the data returned, and then it can step through the dataset, adding to the the running total variable and displaying it at each step.


Unfortunately, this program was an experiment of writing a VB .NET application that interfaces with MySQL (I'm severely considering re-writing it in Java), and the VB .NET datagridview object doesn't allow me to add a row at a time, and I've yet to find a good tutorial on how to step through the data set that gets attached to the object. :(

I'll try your WHERE clause when I get home - thanks!
the idiot is the person who follows the idiot and your not following me your insulting me your following the path of a idiot so that makes you the idiot - LC Tusken
2007-10-26, 8:53 AM #9
Originally posted by Wolfy:
Unfortunately, this program was an experiment of writing a VB .NET application that interfaces with MySQL (I'm severely considering re-writing it in Java), and the VB .NET datagridview object doesn't allow me to add a row at a time, and I've yet to find a good tutorial on how to step through the data set that gets attached to the object. :(


I can help you with that. Once you know the trick it is really really simple. I haven't time to go in to it now, but I'll try and get around to it next week. Shoot me a pm if you like.
2007-10-29, 4:54 AM #10
Okay, I've figured out that it's not as easy in pure VB.NET. C# has a thing called customer iterators, but there's no equivalent in VB.Net, so you'd need to create a class implementing IEnumator<> yourself, which is a lot of effort you don't want.

I'll show you how easy it is in C# just to tempt you to ditch VB.NET, seeing as it's a load of rubbish. The main idea to take away from this is to implement a data access layer in your app, rather than binding directly to a SQL query, which is all sorts of horrible (depending on a specific type of underlying database, mixing presentation and logic, ...)

Firstly, create a couple of classes to represent the model of your transactions:

Code:
    public class Transaction
    {
        public Transaction(int id, string what, double amount, DateTime transactionDate, DateTime addedDate)
        {
            _id = id;
            _what = what;
            _amount = amount;
            _transactionDate = transactionDate;
            _addedDate = addedDate;
        }

        private int _id;
        private string _what;
        private double _amount;
        private DateTime _transactionDate;
        private DateTime _addedDate;

                public int Id
        {
            get { return _id; }
            set { _id = value; }
        }
        public string What
        {
            get { return _what; }
            set { _what = value; }
        }
        public double Amount
        {
            get { return _amount; }
            set { _amount = value; }
        }
        public DateTime TransactionDate
        {
            get { return _transactionDate; }
            set { _transactionDate = value; }
        }
        public DateTime AddedDate
        {
            get { return _addedDate; }
            set { _addedDate = value; }
        }
    }

and another to represent a Transaction with a running total:
Code:
    public class RunningTransaction
    {
        public RunningTransaction(Transaction transaction, double runningTotal)
        {
            _transaction = transaction;
            _runningTotal = runningTotal;
        }

        private Transaction _transaction;
        private double _runningTotal;
        
        public int Id
        {
            get { return _transaction.Id; }
            set { _transaction.Id = value; }
        }
        public string What
        {
            get { return _transaction.What; }
            set { _transaction.What = value; }
        }
        public double Amount
        {
            get { return _transaction.Amount; }
            set { _transaction.Amount = value; }
        }
        public DateTime TransactionDate
        {
            get { return _transaction.TransactionDate; }
            set { _transaction.TransactionDate = value; }
        }
        public DateTime AddedDate
        {
            get { return _transaction.AddedDate; }
            set { _transaction.AddedDate = value; }
        }
        public double RunningTotal
        {
            get { return _runningTotal; }
            set { _runningTotal = value; }
        }
    }


Then I created a third class that encapsulated getting the data out of the database:
Code:
    public class DataAccess
    {
        // This is the results of your MySQL query:
        private IEnumerable<Transaction> GetTransactions()
        {
            yield return new Transaction(21, "Home Depot (Coffee Table)", 20.29, new DateTime(2007, 10, 3), new DateTime(2007, 10, 7, 19, 17, 15));
            yield return new Transaction(23, "Best Buy (Mario Party + Wiimote)", 96.75, new DateTime(2007, 10, 3), new DateTime(2007, 10, 7, 19, 17, 45));
            yield return new Transaction(22, "Price Chopper", 6.63, new DateTime(2007, 10, 4), new DateTime(2007, 10, 7, 19, 17, 31));
            yield return new Transaction(27, "Connections (Ice Cream Sandwiches)", 3.29, new DateTime(2007, 10, 5), new DateTime(2007, 10, 7, 19, 19, 54));
        }
        // This is the result of another query to calculate the starting balance of the above query
        private double GetStartingValue()
        {
            return 264.17;
        }
    }

Obviously, you'd replace the hardcoded values with calls to the database. The trick to note here is that we're not returning a dataset, but objects from our model. The GetTransactions() method returns a type of IEnumerable<> meaning that it can be used in a foreach loop and, more importantly, as the DataSource for a control that DataBind()s.
Obviously, though, that simply returns an iteration over Transactions, without caring about the running total, and doesn't care about the StartingValue either... The answer is another customer iterator method in the class:
Code:
        // This is what you provide to your datagridview control
        public IEnumerable<RunningTransaction> GetTransactionsWithRunningTotal()
        {
            double runningTotal = GetStartingValue();
            foreach (Transaction t in GetTransactions())
            {
                runningTotal += t.Amount;
                yield return new RunningTransaction(t, runningTotal);
            }
        }

So we're using the iteration over Transactions to create another iteration over RunningTransactions.

The key to both of the methods is the "yield" keyword, which is the thing missing from VB.NET. This saves us having to write the IEnumerator object ourselves, it's all handled transparently for us. I could go on at length about how beautiful custom iterators are, but now I'll show you how to use this in a DataGridView.

I'm using Visual Studio, so I was able to have the C# class library project and the VB.NET gui in the same solution. The gui project referenced the C# library project. Starting from a new form then all I needed to do was add a DataGridView control and a BindingSource control. Then in the code for the form, I only needed FIVE lines of code to get the DGV populating:
In the declarations section:
Code:
Imports DataAccess.DataAccess
Imports DataAccess.RunningTransaction

(The namespace of my C# classes was "DataAccess")
Note you do not need to import the Transaction class - we really don't care about it at the gui level.

Then, Sub New() becomes:
Code:
    Public Sub New()

        ' This call is required by the Windows Form Designer.
        InitializeComponent()

        ' Add any initialization after the InitializeComponent() call.
        Dim da As DataAccess.DataAccess = New DataAccess.DataAccess()
        DataGridView1.DataSource = BindingSource1
        BindingSource1.DataSource = da.GetTransactionsWithRunningTotal()

    End Sub


That's all that is required to get the data populating. With a bit of digging, I'd probably figure out how to get it binding at design time, which is how I assume you're doing it with the SQL query?

I also assume you're using VB.NET Express, which won't allow you to put the two projects in one solution, so you would need to build the C# project in VC# Express and then reference the dll output from VB.NET.

As I said, I'd encourage you to ditch VB.NET and make the jump to C#, as it's a much more powerful language. I didn't think VB.NET was as far behind C# as it apparently is.
2007-10-29, 5:23 AM #11
The image is fine, it's the forums that ruin it.
2007-10-29, 7:36 AM #12
Originally posted by Wolfy:
Code:
SELECT 
	f.fin_id AS `id`, 
	f.fin_text AS `What`, 
	concat('$', f.fin_amt) AS `Amount`, 
	DATE_FORMAT(f.fin_occur, '%m/%d/%Y') AS `Date`,
	f.fin_added,
	SUM(f2.fin_amt) as `Balance`
FROM finances f,
finances f2
WHERE f.fin_domain = 'CC'
	AND DATE_FORMAT(f.fin_occur, '%Y') = '2007'
	AND DATE_FORMAT(f.fin_occur, '%m') = '10'
	AND f2.fin_domain = 'CC'
	AND f2.fin_occur <= f.fin_occur
GROUP BY f.fin_id
ORDER BY f.fin_occur, f.fin_added


I googled around and came up with:

Code:
SET @total - 0;
SELECT 
	fin_id AS `id`, 
	fin_text AS `What`, 
	concat('$', fin_amt) AS `Amount`, 
	DATE_FORMAT(fin_occur, '%m/%d/%Y') AS `Date`,
	fin_added,
	@total := @total + fin_amt AS `Balance`
FROM finances
WHERE fin_domain = 'CC'
	AND DATE_FORMAT(fin_occur, '%Y') = '2007'
	AND DATE_FORMAT(fin_occur, '%m') = '10'
GROUP BY fin_id
ORDER BY fin_occur, fin_added;


And to make it pretty:

Code:
SET @total - 0;
SELECT 
	fin_id AS `id`, 
	fin_text AS `What`, 
	concat('$', fin_amt) AS `Amount`, 
	DATE_FORMAT(fin_occur, '%m/%d/%Y') AS `Date`,
	DATE_FORMAT(fin_added, '%m/%d/%Y') AS `Added`,
	concat('$', @total := @total + fin_amt) AS `Balance`
FROM finances
WHERE fin_domain = 'CC'
	AND DATE_FORMAT(fin_occur, '%Y') = '2007'
	AND DATE_FORMAT(fin_occur, '%m') = '10'
GROUP BY fin_id
ORDER BY fin_occur, fin_added;


That - in the set... is that right? I'd think it'd be a =...

Only problem with this might be that the totals will look fine until groups and orders change the row orders, rearranging the totals.. bleh.

Anyways this is someplace to start I suppose.

2007-10-29, 8:16 AM #13
Originally posted by Wolfy:
Unfortunately, this program was an experiment of writing a VB .NET application that interfaces with MySQL (I'm severely considering re-writing it in Java), and the VB .NET datagridview object doesn't allow me to add a row at a time, and I've yet to find a good tutorial on how to step through the data set that gets attached to the object. :(

I'll try your WHERE clause when I get home - thanks!

It's usually the RowCreated event.
Code to the left of him, code to the right of him, code in front of him compil'd and thundered. Programm'd at with shot and $SHELL. Boldly he typed and well. Into the jaws of C. Into the mouth of PERL. Debug'd the 0x258.
2007-10-29, 8:21 AM #14
Originally posted by JediGandalf:
It's usually the RowCreated event.


Ewww, are you seriously suggesting using an event fired by the display control to manipulate the underlying dataset?
2007-10-29, 8:34 AM #15
Use C# 3.0 with LINQ! [http://img98.imageshack.us/img98/3645/judgeoh6.gif]
Bassoon, n. A brazen instrument into which a fool blows out his brains.
2007-10-29, 2:55 PM #16
I wrote a small perl script to go and update each row with the maximum fin_occur + 1 second (within each original grouping of fin_occur) and modified my software to do the same. Now I simply where f2.fin_occur <= f.fin_occur

Something to keep in mind the next time I try to do something like this again...
the idiot is the person who follows the idiot and your not following me your insulting me your following the path of a idiot so that makes you the idiot - LC Tusken

↑ Up to the top!