Informatics, digital & computational pathology

Database

Structured query language


Editorial Board Member: Lewis A. Hassell, M.D.
Editor-in-Chief: Debra L. Zynger, M.D.
Jerome Cheng, M.D.

Last author update: 30 August 2022
Last staff update: 30 August 2022

Copyright: 2019-2024, PathologyOutlines.com, Inc.

PubMed Search: Structured query language

Jerome Cheng, M.D.
Page views in 2023: 204
Page views in 2024 to date: 5
Cite this page: Cheng J. Structured query language. PathologyOutlines.com website. https://www.pathologyoutlines.com/topic/informaticssql.html. Accessed December 4th, 2024.
Definition / general
  • Through a database management system (DBMS), structured query language (SQL) is the language used for interacting with relational databases commonly used in research systems, laboratory information systems (LIS), ecommerce websites, reservation systems and other applications that store and manage data
  • When a search for a test result in your laboratory information system is conducted, an SQL query is sent to your database management system to retrieve your data of interest; the result is returned to your laboratory information system software and eventually displayed on your screen
Background
  • Relational databases, such as those within laboratory information systems, consist of stored data in the form of multiple tables linked to one another; these laboratory databases are composed of hundreds / thousands of tables, with each table containing as many as millions of rows of data
    • For instance, a table containing lab order details may have one row of data for each lab order (each row contains various order details: order date, test ordered, order status)
  • In a laboratory information system, there will be separate tables for patient information, patient visits, ordered tests, specimens, test results and test details
    • Note: many electronic medical records (another type of hospital software system) are based on a different type of database (hierarchical) that follows a tree-like structure; these databases may not be queried using SQL
  • Some tools, like Microsoft Access and LibreOffice Base (an open source database application), enable you to graphically link tables together to create an SQL query without having to write the SQL statement yourself
  • Data from older data storage systems may be migrated to relational databases through a process referred to as ETL (extract, transform, load), where data stored from the original system is extracted and manipulated to fit the formatting requirements of the relational database system
Essential features
  • SQL is the language used for interacting with relational databases
  • In the hospital and laboratory setting, an SQL query will be the predominant way of directly accessing data from research and laboratory information system databases
    • Users are often granted read only rights; commands that alter the contents of the database are reserved for database administrators (e.g., UPDATE, DELETE, INSERT)
  • SQL may be used for cohort selection, result retrieval, management report and financial statement preparation
  • A data dictionary and entity relationship diagram (ERD) is often available to assist in navigating through the various database tables of a laboratory information system
Terminology
  • Database: a system for storing information that allows efficient storage, management and retrieval of data
  • SQL query: a statement for retrieving data from a relational database
  • Field: data element in a row that falls under a column
  • Entity relationship diagram: shows the relationships among different tables within a database (e.g., one to one, one to many)
  • Data dictionary: provides details about each database table, including the content of each field and what field may be linked with another table (e.g., an order table may contain a patient ID field that can be linked with the patient ID field within the patient table)
Applications
Basic SQL commands
  • SELECT: most important and frequently used by the average user
    • Retrieves data from a table or multiple tables
    • Primary method of retrieving data from relational databases is through a SELECT statement; at the bare minimum, it should consist of the SELECT command followed by the columns of interest, the FROM clause, followed by a table name (e.g., SELECT FIRST from patient [this statement returns all first names from the patient table])
    • A query on a single table follows the following structure:
      • SELECT [columns] FROM [table] WHERE [conditions] ORDER BY [column(s)]
    • Fields to be retrieved are specified after the SELECT command
    • Optional WHERE clause specifies the inclusion or exclusion criteria for the data being retrieved; although optional, most SELECT statements will have a WHERE clause, unless the purpose of the SELECT statement is to retrieve all the contents of a table or tables
    • Optional ORDER clause specifies the sorting order and priority order for a column or columns
    • JOIN clause: combines data from 2 tables
      • INNER JOIN: returns rows that match for both tables
      • LEFT OUTER JOIN: returns all rows from the first table and all matching rows from the second table
      • RIGHT OUTER JOIN: returns all rows from the second table and all matching rows for the first table
    • WHERE clause: conditional statement
      • LIKE operator: matches a specified pattern
    • Aggregating functions - COUNT, SUM, AVE, MIN, MAX: give the count, total, average, minimum or maximum value of a set of items, which are specified by the GROUP BY clause 
  • UPDATE: updates contents of a table
  • INSERT INTO: adds data to a table
  • DELETE: removes item(s) from a table
  • DROP TABLE: deletes a table
  • CREATE TABLE: creates table and specifies the data type for each column
  • ALTER TABLE: adds / removes column(s) from a table, alters column data types (e.g., string, date or numeric)
  • Reference: W3Schools: SQL Tutorial [Accessed 25 August 2022]
Examples
  • Example queries and results are based on the 2 tables below (Orders, Patient)

    • Orders:

      ID PATIENT_ID ORDERED_DATE TEST_ID TEST_NAME
      1 53053 20151014 CBC Complete blood count
      2 53053 20151212 FRTN Ferritin
      3 53404 20151212 LIPID Lipid panel
      4 53404 20151126 A1C Hemoglobin A1C
      5 52100 20151205 BASIC Basic metabolic panel
      6 52100 20151204 CBC Complete blood count
      7 52117 20151201 TRIG Triglycerides
      8 52117 20151204 CHOL Cholesterol, total
      9 52136 20151124 BASIC Basic metabolic panel
      10 52136 20151124 CBC Complete blood count

    • Patient:

      ID  MRN FIRST LAST ZIP
      53053 800003295 WINNIE TEST 48176
      53404 800002796 GEORGE FIVE 43619
      52100 060000384 JOHN TRAIN 48112
      52117 060000311 WILLIAM ADULT 48107
      52136 800002844 PATIENT RESEARCH 43619

  • SQL query example #1: SELECT TEST_ID FROM Orders
    • In this example, the TEST_ID field was retrieved from each row
    • Adding the DISTINCT keyword after SELECT will instruct the query to retrieve unique values only (see example #2)
    • Query result:

      TEST_ID
      CBC
      FRTN
      LIPID
      A1C
      BASIC
      CBC
      TRIG
      CHOL
      BASIC
      CBC

  • SQL query example #2: SELECT DISTINCT TEST_ID FROM Orders
    • Query result:

      TEST_ID
      A1C
      BASIC
      CBC
      CHOL
      FRTN
      LIPID
      TRIG

  • SQL query example #3: SELECT * FROM Patient
    • Returns entire content of the patient table
    • Query result:

      ID  MRN FIRST LAST ZIP
      53053 800003295 WINNIE TEST 48176
      53404 800002796 GEORGE FIVE 43619
      52100 060000384 JOHN TRAIN 48112
      52117 060000311 WILLIAM ADULT 48107
      52136 800002844 PATIENT RESEARCH 43619

  • SQL query example #4: SELECT * FROM Orders WHERE TEST_ID ='CBC'
    • Retrieves every row with a TEST_ID of 'CBC'
    • In a SELECT statement, the '*' wildcard character denotes every column will be returned from each row that matches the selection criteria, the selection criteria being TEST_ID = 'CBC' in this case
    • Query result:

      ID PATIENT_ID ORDERED_DATE TEST_ID TEST_NAME
      1 53053 20151014 CBC Complete blood count
      6 52100 20151204 CBC Complete blood count
      10 52136 20151124 CBC Complete blood count

  • SQL query example #5:
    • SELECT TEST_ID, COUNT (TEST_ID) FROM Orders
    • GROUP BY
    • TEST_ID
    • Query result:

      TEST_ID
      A1C 1
      BASIC 2
      CBC 3
      CHOL 1
      FRTN 1
      LIPID 1
      TRIG 1
Board review style question #1

For which type of database are SQL queries used?

  1. Hierarchical
  2. Network
  3. Nosql
  4. Relational
Board review style answer #1
D. Relational. SQL queries are used for retrieving data from relational databases.

Comment Here

Reference: Structured query language (SQL)
Board review style question #2
What does the following SQL statement do?

SELECT * FROM Orders WHERE TEST_ID ='CBC'

  1. Deletes every row in the orders table with a TEST_ID of 'CBC'
  2. Retrieves every row in the orders table
  3. Retrieves every row in the orders table with a TEST_ID of 'CBC'
  4. Returns every order from the TEST_ID table
Board review style answer #2
C. Retrieves every row in the orders table with a TEST_ID of 'CBC'. The statement retrieves every row in the orders table that has a TEST_ID of 'CBC'.

Comment Here

Reference: Structured query language (SQL)
Back to top
Image 01 Image 02