Add Support HTTP compression #802

Merged: bgunebakan merged 4 commits into main from 700-support-http-compression on May 20, 2026

@bgunebakan
Contributor

@bgunebakan bgunebakan commented May 15, 2026

Adds opt-in HTTP compression support to the crate-python driver to reduce bandwidth usage and improve performance for large queries.

Summary of the changes / Why this is an improvement

  • Client-side request compression (compress_client, default: True). Outgoing JSON request bodies are gzip-compressed before being sent.
  • Server-side response compression (compress_server, default: False). When enabled, the driver sends Accept-Encoding: gzip, deflate to the CrateDB server.
  • Compression is skipped for payloads below compress_threshold bytes (default: 8192).
  • 6 new unit tests covering: compression disabled, compression enabled, below-threshold skip, Accept-Encoding sent/not sent, and default-args behavior.
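For illustration, the threshold behaviour described in the list above could look roughly like the following. This is a minimal sketch using only the standard library, not the driver's actual code: `prepare_body` is a hypothetical helper, and only the parameter names `compress_client` and `compress_threshold` come from this description.

```python
import gzip
import json


def prepare_body(payload, compress_client=True, compress_threshold=8192):
    """Hypothetical helper: return (body, headers) for a JSON request.

    Bodies shorter than compress_threshold bytes are sent uncompressed.
    """
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if compress_client and len(body) >= compress_threshold:
        # Compress the serialized JSON and tell the server how to decode it.
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    return body, headers
```

A small statement such as `{"stmt": "SELECT 1"}` stays uncompressed, while a large bulk insert gets a `Content-Encoding: gzip` header.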

Small note on threshold default value:

I changed compress_threshold to 8192 during implementation. I ran a payload-size sweep against both a local CrateDB instance and a CrateDB Cloud cluster (eu-west-1 free tier), measuring round-trip latency with and without gzip compression. Compression started to pay off for payloads larger than about 8 KB on the local instance and about 1 KB on the cloud instance.

These latencies depend on my network and on the cluster's settings and location. I settled on 8 KB as the default threshold.
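A sweep of this kind could be structured roughly as follows. This sketch is an assumption about the shape of such a test, not the script that was actually used: it measures only the local cost and size effect of gzip on synthetic payloads, not round-trip latency against a server, and the statement text and `sweep` function are hypothetical.

```python
import gzip
import json
import time


def sweep(sizes=(512, 1024, 8192, 65536)):
    """Return (raw_bytes, gzip_bytes, seconds) for a range of payload sizes."""
    results = []
    for size in sizes:
        # Synthetic bulk-insert payload; row count scales with the target size.
        rows = [[i, "name-%d" % i] for i in range(size // 16 + 1)]
        raw = json.dumps(
            {"stmt": "INSERT INTO t (id, name) VALUES (?, ?)", "bulk_args": rows}
        ).encode("utf-8")
        start = time.perf_counter()
        packed = gzip.compress(raw)
        elapsed = time.perf_counter() - start
        results.append((len(raw), len(packed), elapsed))
    return results


for raw_len, packed_len, seconds in sweep():
    print(raw_len, packed_len, round(seconds * 1000, 3), "ms")
```

To reproduce the latency comparison, the timed section would need to wrap an actual request to the target cluster instead of the compression call alone.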

Checklist

  • Link to issue this PR refers to (if applicable): Fixes Support HTTP compression #700
  • Added or Changed code is covered by tests
  • Updated documentation & CHANGES.rst

@bgunebakan bgunebakan self-assigned this May 15, 2026
@bgunebakan bgunebakan linked an issue May 15, 2026 that may be closed by this pull request
@bgunebakan bgunebakan requested review from amotl and mfussenegger May 15, 2026 11:19
@amotl amotl requested review from matriv and seut and removed request for amotl and seut May 15, 2026 11:35
Member

@mfussenegger mfussenegger left a comment


  • I think we can remove the compress_server setting and always set the header leaving it up to the server - given that in CrateDB compression also needs to be enabled with default being off.
  • I'd also remove the compress_algorithm setting - given that there is so far only gzip supported there's no point in the setting
  • I'd tend to merge compress_client / compress_threshold to a single setting:
    • number: Compressed if payload body > the number
    • bool true: Always compressed
    • bool false: Never compressed
      Otherwise the interaction between enabled/threshold seems more ambiguous and also sort of redundant to have both separate.

Didn't look into the details yet

@bgunebakan
Contributor Author

Thanks for your review!

>   • I think we can remove the compress_server setting and always set the header leaving it up to the server - given that in CrateDB compression also needs to be enabled with default being off.

Agreed, I'll remove it and always send Accept-Encoding: gzip, deflate. I had made it opt-in partly out of caution around BREACH, but we can leave that control to the server configuration.
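As a rough sketch of what always advertising these encodings implies on the client side: the response body has to be decoded according to the Content-Encoding the server actually returns. Many HTTP client libraries do this transparently; whether the driver relies on that is not stated in this thread, so `decode_body` below is a hypothetical standard-library illustration only.

```python
import gzip
import zlib

# Header advertised on every request; the server decides whether to compress.
ACCEPT_ENCODING = {"Accept-Encoding": "gzip, deflate"}


def decode_body(body, content_encoding=""):
    """Hypothetical helper: decode a response body by its Content-Encoding."""
    encoding = content_encoding.strip().lower()
    if encoding == "gzip":
        return gzip.decompress(body)
    if encoding == "deflate":
        # The HTTP "deflate" coding is the zlib container format.
        return zlib.decompress(body)
    # No header or an unknown coding: return the bytes unchanged.
    return body
```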

>   • I'd also remove the compress_algorithm setting - given that there is so far only gzip supported there's no point in the setting

Agreed, I'll remove it if there is no plan to implement other algorithms.

>   • I'd tend to merge compress_client / compress_threshold to a single setting:
>
>     • number: Compressed if payload body > the number
>     • bool true: Always compressed
>     • bool false: Never compressed
>       Otherwise the interaction between enabled/threshold seems more ambiguous and also sort of redundant to have both separate.
>
> Didn't look into the details yet

It makes sense to merge the compress parameters. The type check needs to test for bool before int; otherwise compress=True would be interpreted as a threshold of 1.
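The ordering matters because `bool` is a subclass of `int` in Python, so `isinstance(True, int)` is true and `True == 1`. A minimal sketch of the merged setting, assuming a hypothetical helper `should_compress` and the `>=` comparison used in the snippet reviewed later in this thread:

```python
from typing import Union


def should_compress(compress: Union[int, bool], body_len: int) -> bool:
    """Hypothetical helper: resolve the merged `compress` setting."""
    # Check bool first: True and False are also ints (1 and 0).
    if isinstance(compress, bool):
        return compress
    # Any other int is treated as a size threshold in bytes.
    return body_len >= compress


print(isinstance(True, int))           # True
print(should_compress(True, 0))        # True: always compress
print(should_compress(False, 10**6))   # False: never compress
print(should_compress(8192, 100))      # False: below the threshold
```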

Member

@mfussenegger mfussenegger left a comment


Left some minor suggestions, otherwise lgtm

Comment thread src/crate/client/http.py Outdated
socket_tcp_keepintvl=None,
socket_tcp_keepcnt=None,
jwt_token=None,
compress=8192,
Member


Suggested change
- compress=8192,
+ compress: Union[int, bool] = 8192,

I think it would be good if we started adding type annotations - at least for the public API

Contributor Author


I added the types here, but I'd like to tackle the remaining annotations in a separate PR. I also had to annotate _inactive_servers and server_pool because mypy was complaining.

Comment thread src/crate/client/http.py
Comment on lines +686 to +688
compress_enabled = self.compress is True or (
not isinstance(self.compress, bool) and len(data) >= self.compress
)
Member


Suggested change
- compress_enabled = self.compress is True or (
-     not isinstance(self.compress, bool) and len(data) >= self.compress
- )
+ compress_enabled = self.compress is True or (
+     isinstance(self.compress, int) and len(data) >= self.compress
+ )

Or might even make it stricter and fail if it's neither bool nor int.
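One detail that may be worth double-checking when comparing the two variants: since `False` is also an `int` (equal to 0), the `isinstance(..., int)` form evaluates to true for `compress=False` unless bools are handled separately. The sketch below wraps both expressions in hypothetical standalone functions to make the difference visible; it is not the code that was merged.

```python
def original_check(compress, data):
    # Variant from the diff: bools never reach the length comparison.
    return compress is True or (
        not isinstance(compress, bool) and len(data) >= compress
    )


def suggested_check(compress, data):
    # Variant from the suggestion: False passes isinstance(..., int).
    return compress is True or (
        isinstance(compress, int) and len(data) >= compress
    )


data = b"x" * 10
print(original_check(False, data))   # False
print(suggested_check(False, data))  # True, because 10 >= False (0)
print(original_check(8, data), suggested_check(8, data))  # True True
```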

Contributor Author


I added a check in __init__; it now fails if the value is neither an int nor a bool.
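A constructor-time check of this kind might look like the sketch below. The class name `CompressSettings`, the choice of `TypeError`, and the error message are assumptions for illustration; the thread does not show the merged implementation.

```python
class CompressSettings:
    """Hypothetical stand-in for the client class that accepts `compress`."""

    def __init__(self, compress=8192):
        # bool is a subclass of int, so this accepts True, False and any int,
        # and rejects everything else (str, float, None, ...).
        if not isinstance(compress, int):
            raise TypeError(
                "compress must be a bool or an int, got %s"
                % type(compress).__name__
            )
        self.compress = compress
```

Failing early in the constructor keeps the per-request check simple, since later code can rely on the value being one of the two supported types.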

@bgunebakan bgunebakan merged commit dc3372f into main May 20, 2026
17 checks passed
@bgunebakan bgunebakan deleted the 700-support-http-compression branch May 20, 2026 09:56