Hello,
I have been using the older execute inference API in TensorRT for several years. Due to a recent GPU upgrade, I am now migrating to a newer version of TensorRT.
I noticed that with the enqueueV3 API, the tensor addresses appear to have to be specified through setTensorAddress before each inference call, rather than being passed as a bindings array as with enqueueV2.
Could you clarify whether calling setTensorAddress on every inference adds overhead that could reduce overall performance or throughput?
Thank you.