Commits

Stan Seibert committed c20b947 Draft

Add more blocks to keep sm_30 devices busy and pick the correct # of processors per multiprocessor.

Files changed (1)

 
 int main()
 {
-  const int blocks = 60;
+  const int blocks = 240;
   const int threads_per_block = 256;
   
   const int nevents = blocks * threads_per_block * 10;
   struct cudaDeviceProp prop;
   cudaGetDeviceProperties(&prop, 0);
   int proc_per_multiproc = 8;
-  if (prop.major == 2) proc_per_multiproc = 32;
+  if (prop.major == 2)
+    proc_per_multiproc = 32;
+  else if (prop.major==3) 
+    proc_per_multiproc = 192;
   printf("Device name: %s\n", prop.name);
   // Bogus normalization metric
-  float bogogflops = 2 * prop.clockRate * prop.multiProcessorCount * proc_per_multiproc / 1e6;
+  float bogogflops = 2 / 1e6 * prop.clockRate * prop.multiProcessorCount * proc_per_multiproc;
   printf("BogoGFLOPS: %1.1f\n\n", bogogflops); 
 
   // Allocate arrays
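
Note on the reordered bogogflops expression: cudaDeviceProp.clockRate is an int in kHz, so the old form multiplied four integers before dividing by 1e6. With 192 processors per multiprocessor that integer product can exceed INT_MAX, which may be why the commit moves 2 / 1e6 to the front to force floating-point arithmetic from the start. The host-only sketch below illustrates this under assumptions: the helper names (cores_per_multiproc, int_product_would_overflow, bogo_gflops) are hypothetical and not part of the committed file, and the per-major core counts simply mirror the values in the diff.

```c
#include <limits.h>

/* Hypothetical helper: processors per multiprocessor keyed on the major
 * compute capability, using the same values as the diff (8 by default,
 * 32 for major 2, 192 for major 3). */
static int cores_per_multiproc(int major)
{
    if (major == 2)
        return 32;
    else if (major == 3)
        return 192;
    return 8;
}

/* Returns 1 if 2 * clock_khz * multiproc_count * cores does not fit in an
 * int. The product is formed in long long so the check itself is safe. */
static int int_product_would_overflow(int clock_khz, int multiproc_count,
                                      int cores)
{
    long long product = 2LL * clock_khz * multiproc_count * cores;
    return product > INT_MAX;
}

/* Same operand order as the new line in the diff: 2 / 1e6 is a double,
 * so every later multiplication is done in floating point. */
static float bogo_gflops(int clock_khz, int multiproc_count, int cores)
{
    return (float)(2 / 1e6 * clock_khz * multiproc_count * cores);
}
```

As an illustration with made-up device numbers (a 1,000,000 kHz clock and 8 multiprocessors, not measurements from real hardware), int_product_would_overflow(1000000, 8, 192) reports an overflow while the same call with 32 processors does not, and bogo_gflops(1000000, 8, 192) comes out at roughly 3072.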