SQL Style
2023-01-19 SQL style guideThis is set of rules I use when composing my SQL and designing my databases. However more often than not I have to use a database designed by some external team that use their own rules.
- consistent and descriptive names that use domain terminology
- lowercase for table and column names, put underscore
_
where would be space - avoid abbreviations unless they are common in the domain
- prefer numeric primary key named
id
, autogenerated - foreign keys named according to target table, e.g.
user_id
, use them to ensure data integrity. Also define yourON UPDATE
andON DELETE
to simplify your database operations and prevent ghost data - keywords and functions in uppercase, always use
AS
keyword - in lists start lines with the comma, it is easier to comment out or add lines
- use whitespace to make the query easy to read, line up the items
- sub-queries use similar indentations, wrapped in the parenthesis
- prefer
BETWEEN
andIN
to combinedAND
/OR
s - prefer
ENUM
with descriptive values to cryptic codes
All above are useful, but probably most important is cleanup of the data before they get into the database and all effort to make sure the data stays consistent. Before loading random Excel produced by management into the database trim all data, define enumeration of values for columns and fix anything that does not fall into defined buckets.
Small formatting example
SELECT
team.name AS Teamname
, team.id AS TeamId
, user.name AS Approvername
, user.id AS UserId
FROM
team_user AS TU
JOIN team ON team.id = TU.team_id
JOIN user ON user.id = TU.user_id
WHERE
TU.role = 'Lead'
ORDER BY
team.name ASC
Somewhat larger example
SELECT
PA.name AS `Moderator`
, PR.name AS `Project`
, P.review_id AS `ReviewID`
, P.filename AS `Filename`
, P.completed_date AS `Approval date`
, P.total_product_size AS `Size [pages]`
, P.sum_review_time AS `Hours`
, COUNT(DISTINCT CT.id) AS `Total Technical`
, COUNT(DISTINCT CP.id) AS `Total Process`
, COUNT(DISTINCT CN.id) AS `Total Non-technical`
, COUNT(DISTINCT CS.id) AS `Total Suggestion`
, COUNT(DISTINCT CD.id) AS `Total Duplicate`
, COUNT(DISTINCT CND.id) AS `Total Not a defect`
, PA.review_time AS `Hours`
, COUNT(DISTINCT CASE WHEN CT.author LIKE CONCAT('%', PA.name ,'%') THEN CT.id END) AS `Technical`
, COUNT(DISTINCT CASE WHEN CP.author LIKE CONCAT('%', PA.name ,'%') THEN CP.id END) AS `Process`
, COUNT(DISTINCT CASE WHEN CN.author LIKE CONCAT('%', PA.name ,'%') THEN CN.id END) AS `Non-technical`
, COUNT(DISTINCT CASE WHEN CS.author LIKE CONCAT('%', PA.name ,'%') THEN CS.id END) AS `Suggestion`
, COUNT(DISTINCT CASE WHEN CD.author LIKE CONCAT('%', PA.name ,'%') THEN CD.id END) AS `Duplicate`
, COUNT(DISTINCT CASE WHEN CND.author LIKE CONCAT('%', PA.name ,'%') THEN CND.id END) AS `Not a defect`
, P.OversightType AS `OversightType`
FROM
packets P
JOIN programs PR ON PR.id = P.program_id
JOIN participants PA ON PA.packet_id = P.id
LEFT JOIN comments CT ON (P.id = CT.packet_id AND CT.IsDefectState = 'DefectTechnical')
LEFT JOIN comments CP ON (P.id = CP.packet_id AND CP.IsDefectState = 'DefectProcess')
LEFT JOIN comments CN ON (P.id = CN.packet_id AND CN.IsDefectState = 'DefectNonTechnical')
LEFT JOIN comments CD ON (P.id = CD.packet_id AND CD.IsDefectState = 'Duplicate')
LEFT JOIN comments CS ON (P.id = CS.packet_id AND CS.IsDefectState = 'Observation')
LEFT JOIN comments CND ON (P.id = CND.packet_id AND CND.IsDefectState IN ('Nondefect', 'NotDefect'))
WHERE
PA.role LIKE '%Moderator%'
GROUP BY
P.id
ORDER BY
Moderator