GPU L6: Computation
00:00:43.542,00:00:46.542 Shibobrota Das cs20m059: AMD
00:00:46.476,00:00:49.476 Rigved Sah cs20m053: Nvidia
00:00:46.860,00:00:49.860 Nishant Prabhu me17b084: nvidia, amd
00:00:55.154,00:00:58.154 Nishitha R ee20s064: NVIDIA, AMD
00:00:55.506,00:00:58.506 Rigved Sah cs20m053: I tel
00:01:03.534,00:01:06.534 Rigved Sah cs20m053: Intel
00:01:04.996,00:01:07.996 Sheera Shamsu CS20D001: Intel
00:04:47.727,00:04:50.727 SANKET NEEMA cs19m055: what are accelerators?
00:04:55.033,00:04:58.033 P Sai Venkat Kushal ee17b141: sir your presentation stopped
00:04:57.099,00:05:00.099 Shibobrota Das cs20m059: Screen sharing stopped for everyone?
00:05:04.254,00:05:07.254 Nishant Prabhu me17b084: Yesp
00:05:44.824,00:05:47.824 Shouvick Mondal: ok
00:07:14.508,00:07:17.508 Aditya Agrawal CS20S026: Sir, again the presentation stoppped.
00:08:15.660,00:08:18.660 Nistala Krishna Vamsi ee20s025: sir why not use opencl instead of cuda
00:13:35.988,00:13:38.988 Shouvick Mondal: CUDA is thoroughly tested.
00:19:57.197,00:20:00.197 Rigved Sah cs20m053: Is 20 MB for each SMX?
00:20:37.630,00:20:40.630 Nishant Prabhu me17b084: 8 GB to 16 GB
00:21:02.617,00:21:05.617 Pachipulusu Jaitesh cs17b021: 128?
00:26:51.369,00:26:54.369 Vignesh S me17b078: china ?
Aditya Agrawal CS20S026: Japan?
Arihant Samar cs18b052: Fugaku
Mohit Singla cs17b113: japan
00:26:56.434,00:26:59.434 Akash Haridas ae17b020: Japan
Nibedita Behera CS20S023: Japan.
P Sai Venkat Kushal ee17b141: america?
Mansi Choudhary ee17b053: japan
Shouvick Mondal: Chennai 36?
00:29:23.185,00:29:26.185 Rigved Sah cs20m053: 62
00:29:43.836,00:29:46.836 Rigved Sah cs20m053: Param
Pachipulusu Jaitesh cs17b021: time for square cpu function call
Akash Haridas ae17b020: wall time
00:33:53.178,00:33:56.178 ROHITH SRINIVAAS M mm16b008: Compile time + Execution time
P Sai Venkat Kushal ee17b141: starting of function call to function completion
Buddhavarapu Venkata Surya Sudheendra cs18b006: maximum
00:38:58.464,00:39:01.464 Nistala Krishna Vamsi ee20s025: average
Rigved Sah cs20m053: Average
Sumit Negi cs20m067: average time
Shouvick Mondal: Geomean
00:39:01.474,00:39:04.474 Shubham Mohan Randive cs20m064: average
Nishant Prabhu me17b084: Run for some number of iterations and average
KASHYAPI SHUBHAM BHALCHANDRA mm16b027: average
Nishitha R ee20s064: Average of many runs.
Mr Rahul Mastram Verma CS20S038: AVG
KARTHIK SURESH ee16b140: median?
KASHYAPI SHUBHAM BHALCHANDRA mm16b027: outer two loops?
00:41:43.247,00:41:46.247 ROHITH SRINIVAAS M mm16b008: i loop
00:41:44.874,00:41:47.874 SHETH DEV YASHPAL CS17B106: parallelize over loop variables i and j
00:41:47.014,00:41:50.014 Patlolla Bharath Simha Reddy cs18b034: i and j
00:41:51.204,00:41:54.204 Rigved Sah cs20m053: I loop
00:42:02.540,00:42:05.540 BIKASH KUMAR BEHERA cs19m019: outer two loops
00:42:02.961,00:42:05.961 Shubham Mohan Randive cs20m064: i and j
00:44:04.252,00:44:07.252 Shagnik Pal ee17b147: row major access? Caches are row major
00:44:07.491,00:44:10.491 P Sai Venkat Kushal ee17b141: both are the same unless there is space caching
00:44:56.682,00:44:59.682 Aditya Agrawal CS20S026: Sir, could you repeat why we couldnt parallelize the kk loop?
00:46:45.766,00:46:48.766 Aditya Agrawal CS20S026: Got it. Thanks
00:49:22.710,00:49:25.710 P Sai Venkat Kushal ee17b141: overhead?
Shagnik Pal ee17b147: The overheads don't make up for the parallelism
00:49:26.907,00:49:29.907 Nistala Krishna Vamsi ee20s025: less value of matrix size
00:49:29.591,00:49:32.591 Akash Haridas ae17b020: loops run slower on GPU?
00:49:31.647,00:49:34.647 Nishant Prabhu me17b084: N is too small for any benefit
Amalan S EE20D408: cudaMemCpy
00:49:49.674,00:49:52.674 Vignesh S me17b078: what is overhead ?
00:50:13.800,00:50:16.800 Pachipulusu Jaitesh cs17b021: frequency of gpu cores is also low, approx half
00:50:55.383,00:50:58.383 Pachipulusu Jaitesh cs17b021: 3.5-4
00:50:56.430,00:50:59.430 Shibobrota Das cs20m059: 4.1 ghz
00:50:56.504,00:50:59.504 Nishant Prabhu me17b084: 2 ghz
P Sai Venkat Kushal ee17b141: 3ghz
Akash Haridas ae17b020: 3-4GHz
SHETH DEV YASHPAL CS17B106: 2.5 GHz
KASHYAPI SHUBHAM BHALCHANDRA mm16b027: 2 GHz
Buddhavarapu Venkata Surya Sudheendra cs18b006: 4-5Ghz
Nistala Krishna Vamsi ee20s025: 2 ghz
Shibobrota Das cs20m059: GPU- 1.4ghz
Pachipulusu Jaitesh cs17b021: 1.5-2
P Sai Venkat Kushal ee17b141: 1.2Gh
Prasoon Mishra CS20S028: 1.5 - 2.1 Ghz
Shouvick Mondal: because of heating problem?
Pachipulusu Jaitesh cs17b021: efficiency