# Assignment
To run it, install Docker or Podman, then run:
```console
$ cargo test
Finished `test` profile [unoptimized + debuginfo] target(s) in 0.70s
Running unittests src/lib.rs (target/debug/deps/supermetal_assignment-a8159c3e18c41f11)
running 1 test
(took 2197.842923 ms)
test test::test_mysql ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 29.89s
...
```
## Future direction
There are a few obvious places to take this further. You probably don't care about most of them, but just so you know I'm aware of them:
- Nothing is configurable outside of the code (e.g. the Parquet batch size is a constant); see the sketch after this list for one way a CLI could expose such settings. It might have been better to build a CLI rather than driving everything from a test. Production code would of course be more configurable and tunable.
- My dataset is a bit lopsided: an employees table with 500K records and a departments table with only 9. That made it difficult to establish definitively how much concurrency helped in some areas, but intuitively it made sense and would probably bear out on real data.
- As previously stated, I'm not familiar with Parquet/Arrow outside of my use of it with DuckDB/Overture Maps. There are probably additional ways to speed things up, and I look forward to learning about those when we work together.
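
As a rough sketch of the first point, here is what a small CLI wrapper could look like. It assumes `clap` and `anyhow` as dependencies, and `export_table` plus the flag names are hypothetical stand-ins for whatever entry point the library actually exposes:

```rust
// Hypothetical CLI sketch: expose the Parquet batch size (currently a constant)
// as a flag. `export_table` is an assumed library entry point, not the real API.
use clap::Parser;

#[derive(Parser, Debug)]
struct Args {
    /// MySQL connection URL, e.g. mysql://user:pass@localhost:3306/db
    #[arg(long)]
    database_url: String,

    /// Table to export.
    #[arg(long)]
    table: String,

    /// Rows per Parquet record batch (was a hard-coded constant).
    #[arg(long, default_value_t = 65_536)]
    batch_size: usize,

    /// Output Parquet file.
    #[arg(long, default_value = "out.parquet")]
    output: std::path::PathBuf,
}

fn main() -> anyhow::Result<()> {
    let args = Args::parse();
    // Stand-in for the existing export logic in src/lib.rs.
    supermetal_assignment::export_table(
        &args.database_url,
        &args.table,
        args.batch_size,
        &args.output,
    )
}
```

The same settings could just as easily come from environment variables or a config file; the point is only that tunables like the batch size move out of the source.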
Thanks, and do let me know if you have any additional questions.