API (Self-hosted on AWS): Sizing
The EC2 instance type (InstanceType configuration parameter) determines the number of vCPUs available to process requests. The default instance type, m6i.large, provides two vCPUs. attachmentAV handles 2*vCPUs requests in parallel, so one m6i.large handles four requests in parallel.
attachmentAV runs EC2 instances in an Auto Scaling Group with a minimum size of two (AutoScalingMinSize configuration parameter). Therefore, two m6i.large instances can handle eight requests in parallel.
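The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the 2*vCPUs rule stated in this section; the function name parallel_capacity is hypothetical and not part of attachmentAV.

```python
def parallel_capacity(vcpus_per_instance: int, instance_count: int) -> int:
    """Requests handled in parallel across all instances.

    Uses the rule from this section: each instance handles 2 * vCPUs
    requests in parallel. The function name is illustrative only.
    """
    return 2 * vcpus_per_instance * instance_count


# m6i.large has two vCPUs.
single_instance = parallel_capacity(vcpus_per_instance=2, instance_count=1)
minimum_group = parallel_capacity(vcpus_per_instance=2, instance_count=2)
print(single_instance, minimum_group)
```

To estimate the capacity of a different setup, pass the vCPU count of your chosen instance type and the number of instances in the Auto Scaling Group.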
If you send more requests to attachmentAV than it can process concurrently, the excess requests are queued. The maximum length of the request queue is 6*vCPUs, so one m6i.large queues at most 12 requests. If the request queue is full, attachmentAV returns an HTTP response with status code 429.
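As a rough sizing aid, the following Python sketch combines the two rules from this section (2*vCPUs in parallel, 6*vCPUs queued, per instance) to estimate how many requests of a simultaneous burst would receive status code 429. It rests on simplifying assumptions that this section does not confirm: requests are spread evenly across instances, and no request finishes while the burst arrives. The Fleet class and its members are hypothetical names used for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fleet:
    """Hypothetical model of a group of identical attachmentAV instances."""

    vcpus_per_instance: int
    instance_count: int

    @property
    def parallel(self) -> int:
        # 2 * vCPUs requests are processed in parallel per instance.
        return 2 * self.vcpus_per_instance * self.instance_count

    @property
    def queued(self) -> int:
        # 6 * vCPUs requests can wait in the queue per instance.
        return 6 * self.vcpus_per_instance * self.instance_count

    def rejected(self, burst: int) -> int:
        # Assumption: even distribution and no request completing
        # during the burst, so this is an idealized estimate.
        return max(0, burst - self.parallel - self.queued)


# Two m6i.large instances (two vCPUs each).
fleet = Fleet(vcpus_per_instance=2, instance_count=2)
print(fleet.parallel, fleet.queued, fleet.rejected(40))
```

In practice, requests complete continuously and load is rarely perfectly balanced, so treat the result as an estimate and make sure clients retry requests that receive status code 429.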