In my Flask application, I have something similar to a bank account: One User
has one Account
, credit entries are modeled as Incomings
, deductions are modeled as Outgoings
.
The problem:
Get an "account statement" for one user, i.e. credit entries / deductions per day, e.g.
Thu 29 Aug 2019
Some deduction: -23.00
Some credit: 123.00
Fri 30 Aug 2019
Big credit: 4223.00
Another deduction: -42.00
My data model:
This is what (a simplified version of) my models.py
looks like:
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Float, ForeignKey, Integer, Text, TIMESTAMP
from sqlalchemy.orm import relationship
Base = declarative_base()
class Account(Base):
__tablename__ = 'account'
id = Column(Integer, primary_key=True)
balance = Column(Float, nullable=False)
userID = Column(Integer, ForeignKey('user.id'))
incomings = relationship("Incoming", back_populates="account")
outgoings = relationship("Outgoing", back_populates="account")
user = relationship("User", back_populates="account")
class Incoming(Base):
__tablename__ = 'incoming'
id = Column(Integer, primary_key=True)
accountID = Column(Integer, ForeignKey('account.id'))
amount = Column(Float, nullable=False)
description = Column(Text, nullable=False)
timestamp = Column(TIMESTAMP, nullable=False)
account = relationship("Account", back_populates="incomings")
class Outgoing(Base):
__tablename__ = 'outgoing'
id = Column(Integer, primary_key=True)
accountID = Column(Integer, ForeignKey('account.id'))
amount = Column(Float, nullable=False)
description = Column(Text, nullable=False)
timestamp = Column(TIMESTAMP, nullable=False)
account = relationship("Account", back_populates="outgoings")
class User(Base):
__tablename__ = 'user'
id = Column(Integer, primary_key=True)
name = Column(Text, nullable=False)
account = relationship("Account", back_populates="user")
My general intended approach:
- get all
Incomings
for user, grouped by day - get all
Outgoings
for user, grouped by day - somehow merge the two lists, grouped by day
My background:
It's been a while since I've worked with the underlying database PostgreSQL (but then, I did manage to set up a trigger function to auto-update the balance), but as far as SQLAlchemy (the ORM in use) is concerned, I merely seem to have scratched the surface.
Step 1: get all Incomings
for user, grouped by day
Following the first SO hit, I tried
from sqlalchemy import func
# existing sample account ID
accountID = 42
# not relevant to the point at hand, known to work
db_session = get_a_scoped_session_from_elsewhere()
db_incomings = db_session.query(Incoming) \
.filter(Incoming.accountID == accountID) \
.group_by(func.day(Incoming.timestamp)) \
.all()
but this fails with
ProgrammingError: (psycopg2.errors.UndefinedFunction) ...
... function day(timestamp without time zone) does not exist
which seems to indicate that PostgreSQL doesn't support day
.
According to this SO answer,
# imports and variables as above
db_incomings = db_session.query(Incoming) \
.filter(Incoming.accountID == accountID) \
.group_by(func.date_trunc('day', Incoming.timestamp)) \
.all()
works for PostgreSQL, but for me fails with
ProgrammingError: (psycopg2.errors.GroupingError) ...
... column "incoming.id" must appear in the GROUP BY clause ...
... or be used in an aggregate function
When I just blindly try to do what the error message tells me and add incoming.id
to the GROUP BY
clause as in
db_incomings = db_session.query(Incoming) \
.filter(Incoming.accountID == accountID) \
.group_by(Incoming.id,
func.date_trunc('day', Incoming.timestamp)) \
.all()
the code works, but does not return the wanted result; instead, I get a list of objects like
{'timestamp': datetime.datetime(2019, 8, 29, 10, 4, 27, 459000), 'id': 1, 'accountID': 42, ...}
{'timestamp': datetime.datetime(2019, 8, 29, 10, 8, 21, 493000), 'id': 2, 'accountID': 42, ...}
{'timestamp': datetime.datetime(2019, 8, 29, 10, 8, 42, 660000), 'id': 3, 'accountID': 42, ...}
which isn't really surprising, considering I'm grouping by Incoming.id
.
Trying to understand the underlying issue (see e.g. here or here), it seems I cannot reference a field on the SELECT statement (i.e. the SQLAlchemy .query
) if it doesn't appear on the GROUP BY clause (i.e. the SQLAlchemy .group_by
). Looking at the error message, this also seems to be the case vice versa.
I've been wrecking my brain for a couple of hours now, found lots of alternatives to func.date_trunc
and have 800 browser tabs open, but still no idea how to approach this.
My question: How do I need to structure / build up the SQLAlchemy query ?
from SQLAlchemy: group by day over multiple tables
No comments:
Post a Comment