Contest Judging

Qualification Requirements

Entries must meet the following requirements to qualify for the competition:

All collateral is contained in a GitHub repository shared with the contest operator (Supranational), including everything needed to run your model. This may include:
- Code - RTL, software, scripts
- Documentation
- Constraint files
- TCL scripts
- Makefiles
Reasonable documentation of the design, including:
- High-level algorithm - include architectural drawings, formulas, pseudo-code, models (Python, etc.)
- Key implementation details
- Detailed instructions to reproduce all inputs and results
Conforms to the specified modular squaring interface
Simulates successfully with the provided modulus
- Vivado behavioral simulation passes for 10k iterations
- SDAccel hardware emulation passes for 10 iterations
Synthesizes and implements successfully in the AWS F1 SDAccel flow
Executes and produces the correct result on AWS F1 FPGA hardware for 1B iterations using a random input
Complies with AWS F1 usage agreements
Complies with this contest's official rules

Estimate performance for all qualifying designs using the SDAccel synthesis clock frequency and the simulated cycles per squaring (cycles/sq). For example, given 8 cycles/sq and 161 MHz, the latency per squaring is (1/161)*1000*8 = 49.7 ns.
Select the design with the highest estimated performance (lowest latency) as well as any designs within 3 ns of that result.
Execute these designs on AWS F1 using the available (granular) clocking. Measure performance and functional correctness of 1B repeated squarings. The clock frequencies available natively from AWS F1 are documented here: https://github.com/aws/aws-fpga/blob/master/hdk/docs/dynamic_clock_config.md.
If this produces a clear winner, stop. This is the most likely outcome.
It is possible to have a result from #4 where the interaction of F1 auto frequency scaling and the available granular clock inverts the ordering between two designs. In this situation we will endeavor to run the affected designs with a more precise MMCM-generated clock, running at the SDAccel auto-scaling recommended frequency, on either AWS F1 or a standalone VCU118 board to determine the winner. If this is not practical, the winner will be determined based on the auto-scaling recommended frequency.
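
The estimate and shortlist steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the judges' tooling: the function names, the entry names, and all clock and cycle figures other than the 8 cycles/sq at 161 MHz example are hypothetical, and treating "within 3 ns" as inclusive is an assumption.

```python
def latency_ns(cycles_per_sq, freq_mhz):
    """Estimated latency of one modular squaring, in nanoseconds."""
    return (1.0 / freq_mhz) * 1000.0 * cycles_per_sq


def shortlist(designs, window_ns=3.0):
    """Return (name, latency) pairs for the fastest design and any design
    whose latency is within window_ns of it, fastest first.

    Assumption: the 3 ns window is treated as inclusive.
    """
    latencies = {name: latency_ns(c, f) for name, (c, f) in designs.items()}
    best = min(latencies.values())
    kept = [(n, lat) for n, lat in latencies.items() if lat - best <= window_ns]
    return sorted(kept, key=lambda item: item[1])


# Hypothetical entries: name -> (cycles per squaring, SDAccel clock in MHz).
designs = {
    "example_8cyc": (8, 161.0),  # the worked example from the text
    "example_a": (8, 155.0),     # hypothetical
    "example_b": (10, 170.0),    # hypothetical
}

for name, lat in shortlist(designs):
    print(f"{name}: {lat:.1f} ns")
```

With these made-up entries, `example_b` falls outside the 3 ns window and would not advance to the hardware run.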

To further illustrate the scenario described in #4, consider the following outcome with designs A and B.

This table shows how, using the estimated frequency from auto-scaling, design A wins with a lower overall latency:

However, when you apply the available granular clocking options, the results flip and B wins:

Executing at the estimated maximum frequency resolves this issue.
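
To make the inversion concrete, here is a small Python sketch. Every number in it is hypothetical: the list of available clocks is not the AWS F1 list (see the AWS document linked above for the real values), the cycle counts and recommended frequencies for A and B are invented for illustration, and the sketch assumes each design runs at the highest available clock that does not exceed its recommended frequency.

```python
def latency_ns(cycles_per_sq, freq_mhz):
    """Latency of one squaring in nanoseconds."""
    return cycles_per_sq * 1000.0 / freq_mhz


def snap_down(freq_mhz, available_mhz):
    """Highest available clock not exceeding the recommended frequency
    (assumed selection rule)."""
    usable = [f for f in available_mhz if f <= freq_mhz]
    if not usable:
        raise ValueError("no available clock at or below the requested frequency")
    return max(usable)


def winner(designs, available_mhz=None):
    """Name of the lowest-latency design, optionally after snapping each
    design's clock down to an available frequency."""
    def clock(freq):
        return freq if available_mhz is None else snap_down(freq, available_mhz)
    return min(designs, key=lambda n: latency_ns(designs[n][0], clock(designs[n][1])))


# Hypothetical clock options, NOT the real AWS F1 list.
AVAILABLE_MHZ = [125.0, 156.25, 187.5]

# Hypothetical designs: name -> (cycles per squaring, recommended MHz).
designs = {"A": (8, 155.0), "B": (9, 170.0)}

print("estimated frequency:", winner(designs))               # A
print("granular clocking:  ", winner(designs, AVAILABLE_MHZ))  # B
```

In this made-up case A is faster at its recommended 155 MHz (about 51.6 ns against 52.9 ns for B), but A has to drop to 125 MHz (64.0 ns) while B only drops to 156.25 MHz (57.6 ns), so the ordering flips.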