> For the complete documentation index, see [llms.txt](https://www.isaacbowen.com/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://www.isaacbowen.com/2024/04/26.md).

# 20240426

<div align="left"><figure><img src="/files/gzgyeIf2K1h6X9U3AMpb" alt="" width="188"><figcaption></figcaption></figure></div>

{% hint style="info" %}
These are my notes from [PGDay Chicago 2024](https://2024.pgdaychicago.org/)! They may or may not be useful for you! They are here mostly as a reference for me. :)
{% endhint %}

## Hettie <a href="#feoy6nfri6zg" id="feoy6nfri6zg"></a>

Time dimensions: This is something beyond common sense in the real life, however, we need them

GiST has exclusion constraints - solves the temporal primary key problem for periods

Fully temporal queries are still tricky (we don't know how to write them!)

The refinements of manipulation semantics

* INSERT
* UPDATE
* CORRECTION
* INACTIVATE
* DELETE

Effective vs assertive time: our knowledge of what will be (effective), where that knowledge is current as of a certain time (assertive)

"I am little bit like electron, everywhere and nowhere"

In real life, there are very few objects that are constantly changing. Storage costs correlate with changes that occur, as is natural.

## Drako <a href="#ja7vei6q6t7o" id="ja7vei6q6t7o"></a>

EXPLAIN output illustrates the execution of the query

Deepest elements of the query are executed first, naturally - see the deeper elements of the returned tree to understand the formation of the cost total

width=20 : bytes consumed\
cost=0.0..1.0 : startup cost, total cost\
time=0.0..1.0 : startup time, total time\
loops=2 : total number of executions of the node

cost is figurative, an arbtirary unit. bigger number means it's harder to run. don't compare cost between queries; cost scale is only consistent within EXPLAIN results for a single query

seq scan is a literal sequence scan, one row at a time. **can sometimes be cheaper than index scans.**

bitmap index scan involves two indexes, choosing the result of a boolean operation between two index scans (could be the same index twice, or two different indexes)

nested loop: for each element from a scan, do another scan based on that element

hash join: creates an in-memory hash table from the results of another scan, then scans the keys of that hash table

merge join: sorts two data sets, merges them

(aside: "LLibran"? the "ll\_" from Hettie's talk?)

tools

* pgadmin
* explain.depesz.com - expert
* explain.dalibo.com - visual tree thing, send to Matt
* pgmustard - beginner-friendly

google this: "explaining the unexplainable" (depesz or something)

## Lukas <a href="#m9wz2750treo" id="m9wz2750treo"></a>

EXPLAIN ANALYZE, EXPLAIN BUFFERS (woah what is *buffers*)

hypopg: "what would be the estimated cost of this query, if this index existed?"

index selection should account for write overhead, not just query efficiency

"costs are arbtirary units. they do not represent milliseconds or any other unit of time. instead, they are anchored to a single read of a sequential page, which costs 1.0 unit." -Tadeas Petak, paraphrasing pg docs

suggestion: EXPLAIN all table queries, collect all recurring scans

Index Write Overhead = the estimated size of an index write (in bytes), based on the index definition, divided by the size of the average table row

Optimization: (1) Find a good solution to a problem; (2) How? Heuristics; Exact methods (MIP, CP, etc.)

whitepaper: A Constraint Programming Approach for Index Selection in Postgres

1\. Minimize the costs, resulting in x\
2\. New rule: the cost cannot be higher than x + 10% (tolerance!)\
3\. Maximize the performance, while respecting cost + tolerance

INSERT INTO test SELECT \*, 'testdatatestdata' FROM generate\_series(1, 100000);

EXPLAIN (GENERIC\_PLAN) SELECT \* FROM test WHERE data = $1;

remember HOT updates are an option, *but only for unindexed data*

## Karen, Stacey <a href="#t1fototz4l21" id="t1fototz4l21"></a>

Harvard University Implicit Bias test

"You can't compete with me — I want you to win, too."

hubspot: 90% of women suffer from imposter syndrome

Deprogramming: Delete, Delegate, Delay, Do

allyship: use one's power, position, or privilege to uplift others

advocacy, mentorship, amplification

"I want to be a part of the people that make meaning"

## Christophe <a href="#id-8l0z65sk4xwd" id="id-8l0z65sk4xwd"></a>

w3.org/International/questions/qa-personal-names

linuxmafia.com/kb/Essays/marriage.html

"this is what happens if you think you can model famliies with a database schema" -- see: sony family plans

ultimately, the takeaway is to only include operationally necessary elements in your data model, and only require what is strictly necessary for operation

**Under-model. Under-collect. Over-communicate. Focus on the core need.**

DOB is very discoverable btw - not useful for authentication

## Derk <a href="#njx8bkrflqfo" id="njx8bkrflqfo"></a>

extensions used in toast recovery:\
pageinspect, pg\_visibility, pg\_surgery

## Bruce Momjian <a href="#oyo8o69v6f7o" id="oyo8o69v6f7o"></a>

pgmustard.com/docs/explain

UNION, EXCEPT, and INTERSECT default to being unique. add ALL to get everything.

things Isaac doesn't know:

* window functions
* ROLLUP()
* MATERIALIZED
* can I use MEMOIZE to replace the loose index scans that the Mechanic shop scheduler is using?
* a "project" you do is pronounced differently than your intent to "project"
* "tid" and "ctid" refer to toast ID
* TRUNCATE is a DDL, not a DML, and as such isn't supported by EXPLAIN
* custom scan providers

## Jonathan <a href="#j5luakk14ysk" id="j5luakk14ysk"></a>

HNSW for pgvector

## Shane <a href="#pyk6fs1wkxao" id="pyk6fs1wkxao"></a>

money\_type is text:

* **postgres queries support hints??????**
* create index (user\_id, id, money\_type) - **Index Scan Backward**
* create index (user\_id) INClUDE (id, money\_type)

## Rekha <a href="#fdshppisdv9e" id="fdshppisdv9e"></a>

Hugging Face: open-source ai models

**you need to choose the right model for your use-case**

**don't be afraid. just learn, and enjoy your journey. I scratch my head, but we all do for our day-to-day thing. AI is no big thing. we can hack it.**

## David <a href="#vgre42lofkni" id="vgre42lofkni"></a>

JSON\_TABLE in pg17

**mechanic: time to start storing task results as a json array of objects, probably**

## Drako <a href="#id-198sweb7u6ia" id="id-198sweb7u6ia"></a>

(years ago) I chose postgres - I was a rebel.

today, I am proud to use postgres. I am proud to be a rebel.

## Jimmy <a href="#m1ngfyxk0txk" id="m1ngfyxk0txk"></a>

in memoriam: Simon Riggs

"when analyzing satellite data, he discovered that he had a passion for databases"

keynote at pgconf europe 2023: "PostgreSQL: The Next 20 Years"


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter, and the optional `goal` query parameter:

```
GET https://www.isaacbowen.com/2024/04/26.md?ask=<question>&goal=<endgoal>
```

`ask` is the immediate question: it should be specific, self-contained, and written in natural language.
`goal` is optional and describes the broader end goal you are ultimately trying to accomplish on behalf of the user. GitBook uses it to tailor the answer towards what is most useful for that goal.

The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.