GPU L4: Computation

Name: GPU L4: Computation
Uploaded: Feb 5, 2021
Duration: 3287 s

Channel: HPC Education (5.54K subscribers)

4.2K views

00:07:48.311,00:07:51.311 KASHYAPI SHUBHAM BHALCHANDRA mm16b027: If the number of threads is equal to the size of the array, why do we need the id < alen condition?
00:10:56.760,00:10:59.760 rahul verma: do we have any ordering among the 32 threads?
00:12:43.058,00:12:46.058 Buddhavarapu Venkata Surya Sudheendra cs18b006: Sir why dont we have unsigned int but just unsigned
00:18:46.169,00:18:49.169 Suraj Kumar Rai cs20m068: is sir audio is breaking??
00:18:51.747,00:18:54.747 BIKASH KUMAR BEHERA cs19m019: no
00:19:20.750,00:19:23.750 Kaushal Kiritbhai Kapadiya cs20m029: yes
00:22:15.304,00:22:18.304 KASHYAPI SHUBHAM BHALCHANDRA mm16b027: Sir why don't we need cudaDeviceSynchronize() between init and add? Shouldn't the execution of init be completed before add is called?
00:26:45.268,00:26:48.268 SANKET NEEMA cs19m055: sir which Book ?
00:28:33.006,00:28:36.006 Shouvick Mondal: Programming Massively Parallel Processors, by Kirk and Hwu.
00:29:46.637,00:29:49.637 Rigved Sah cs20m053: What is Screaming Multiprocessor?
00:30:00.190,00:30:03.190 Rigved Sah cs20m053: Streaming*
00:34:52.585,00:34:55.585 Shouvick Mondal: Geometrically, the threads are organized in 9 dimensions?
Banavath Tarun cs17b102: 1
Aditya Agrawal CS20S026: 1
Nistala Krishna Vamsi ee20s025: 1
Bolloju Sai Nitish Kumar cs20m018: 1
K Sampreeth Prem cs17b013: 1
Jeswant Krishna ae17b030: 1
Rigved Sah cs20m053: 1
Purandare Chinmay Prashant ee17b062: 1
Harshit Kedia cs17b103: 1
Patlolla Bharath Simha Reddy cs18b034: 1
00:39:53.546,00:39:56.546 Banavath Tarun cs17b102: 2*3*4*6*7
00:40:12.543,00:40:15.543 BIKASH KUMAR BEHERA cs19m019: 2*3*4*6*7
00:40:18.190,00:40:21.190 Nishant Prabhu me17b084: 2*3*4*6*7
00:40:22.181,00:40:25.181 Mamilla Sai Yashwanth cs18b027: 2*3*4*6*7
00:40:26.906,00:40:29.906 P Sai Venkat Kushal ee17b141: 2*3*4*6*7
00:40:28.423,00:40:31.423 Piyush Avinash Chincholikar cs20m045: 2*3*4*6*7
00:40:29.355,00:40:32.355 K Sampreeth Prem cs17b013: 2*3*4*6*7
00:40:31.225,00:40:34.225 Patlolla Bharath Simha Reddy cs18b034: 2*3*4*6*7
00:40:32.325,00:40:35.325 Anumala Venu Madhava Reddy cs18b051: 2*3*4*6*7
00:40:32.588,00:40:35.588 Sumit Negi cs20m067: 2*3*4*6*7
00:40:33.539,00:40:36.539 Nibedita Behera CS20S023: 2*3*4*6*7
00:40:33.766,00:40:36.766 Abishek S ee18b001: 1008
00:40:35.053,00:40:38.053 Mohammed Shan P S cs20m039: 2*3*4*6*7
00:46:42.260,00:46:45.260 ee20s136 Rahual G S: int i = threadIdx.x; int j = threadIdx.y; arr[i][j] = i * blockDim.y + j;
00:47:07.516,00:47:10.516 P Sai Venkat Kushal ee17b141: __global__ void(int *matrix){ matrix[threadId.x*M + threadId.y] = threadId.x*M + threadId.y }
00:47:10.140,00:47:13.140 BIKASH KUMAR BEHERA cs19m019: __global__ void init (int *garr){ if (threadIdx.x == M) if(threadIdx.y = N) *(garr+ threadIdx.x * N + threadIdx.y) = threadIdx.x * threadIdx.y - 1; }
00:47:21.486,00:47:24.486 Aditya Agrawal CS20S026: __global__ void initKernel(int *arr,int M,int N){ int i = threadIdx.x; int j = threadIdx.y; if((i < M) && (j < N)){ arr[i*blockDim.y+j] = i*blockDim.y+j; } }
00:47:29.374,00:47:32.374 Banavath Tarun cs17b102: unsigned int idx = threadIdx.x*blockDim.y + threadIdx.y; matrix[idx] = idx;
00:47:38.439,00:47:41.439 BIKASH KUMAR BEHERA cs19m019: sorry those if conditions wont be there
00:47:39.668,00:47:42.668 Bolloju Sai Nitish Kumar cs20m018: int x = threadIdx.x, y = threadIdy.y; if(x < N && y < M) matrix[x][y] = x*M+y;
00:47:44.490,00:47:47.490 Mohammed Shan P S cs20m039: global void init(unsigned *m) { unsigned i = threadIdx.x * blockDim.y + threadIdx.y; m[i] = i; }
00:47:45.898,00:47:48.898 Nistala Krishna Vamsi ee20s025: __global__ void dKernal(int* a, unsigned n, unsigned m) { unsigned int id = threadIdx.x + blockDim.x*threadIdx.y; if(id < n*m) a[id] = id; }
00:48:37.011,00:48:40.011 rahul verma: arr[threadidx.*m+threadidx.y]=threadidx.*m+threadidx.y;
00:48:45.152,00:48:48.152 Nibedita Behera CS20S023: int id = threadId.x*blockDim.y + threadId.y; matrix[id] = id;
00:49:04.671,00:49:07.671 SHETH DEV YASHPAL CS17B106: __global__ void dkernel(int *a){ a[threadIdx.y*blockDim.x+threadIdx.x] = threadIdx.y*blockDim.x + threadIdx.x; }
00:50:02.428,00:50:05.428 rahul verma: index of the matrix element
00:50:11.662,00:50:14.662 P Sai Venkat Kushal ee17b141: sir can we keep matrix as two dimensional here?
00:50:23.137,00:50:26.137 P Sai Venkat Kushal ee17b141: in gpu
00:50:46.643,00:50:49.643 P Sai Venkat Kushal ee17b141: ok got it
00:50:51.761,00:50:54.761 Aditya Agrawal CS20S026: Sir when using cudaMalloc, the docs sometimes pass the array as (void **) . Is it necessary?
00:51:33.449,00:51:36.449 Aditya Agrawal CS20S026: Ok thank you sir
00:53:05.315,00:53:08.315 Dhruv Gopalakrishnan ae17b004: u need a compatible gcc
00:54:44.123,00:54:47.123 Bolloju Sai Nitish Kumar cs20m018: Sir grading is absolute?
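
The matrix-initialization answers posted in the chat mostly converge on one pattern, which might look roughly like the sketch below. This is not the lecturer's code: the kernel name initMatrix, the helper fillOnDevice, and the 4 x 5 size are assumptions made only for illustration. The sketch assumes a single 2D thread block with one thread per element, a flat device array indexed in row-major order, and a bounds guard of the kind asked about earlier in the chat.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical matrix size, chosen only for this illustration.
#define ROWS 4
#define COLS 5

// Thread (x, y) of a single 2D block writes its own row-major index.
__global__ void initMatrix(unsigned *matrix, unsigned rows, unsigned cols) {
    unsigned i = threadIdx.x;
    unsigned j = threadIdx.y;
    // The guard is redundant when the block shape equals the matrix shape,
    // but it keeps the kernel safe if more threads than elements are launched.
    if (i < rows && j < cols) {
        unsigned idx = i * cols + j;   // flat index into the 1D device array
        matrix[idx] = idx;
    }
}

// Hypothetical helper: fills host[0 .. rows*cols-1] via the GPU.
// Returns 0 on success, 1 on any CUDA error.
// Assumes rows * cols fits in one thread block.
int fillOnDevice(unsigned *host, unsigned rows, unsigned cols) {
    unsigned *dev = nullptr;
    size_t bytes = (size_t)rows * cols * sizeof(unsigned);

    // Compiled as C++ by nvcc, cudaMalloc accepts &dev directly;
    // plain C code would pass (void **)&dev instead.
    if (cudaMalloc(&dev, bytes) != cudaSuccess) return 1;

    dim3 block(rows, cols, 1);         // one thread per matrix element
    initMatrix<<<1, block>>>(dev, rows, cols);
    if (cudaGetLastError() != cudaSuccess) { cudaFree(dev); return 1; }

    // cudaMemcpy waits for earlier work on the default stream,
    // so no explicit cudaDeviceSynchronize() is needed before it.
    cudaError_t err = cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev);
    return err == cudaSuccess ? 0 : 1;
}

int main() {
    unsigned host[ROWS * COLS];
    if (fillOnDevice(host, ROWS, COLS) != 0) {
        fprintf(stderr, "CUDA call failed\n");
        return 1;
    }
    for (unsigned i = 0; i < ROWS; ++i) {
        for (unsigned j = 0; j < COLS; ++j)
            printf("%3u ", host[i * COLS + j]);
        printf("\n");
    }
    return 0;
}
```

Kernels launched into the same stream run in launch order, which is why two back-to-back kernels such as an init followed by an add do not need a cudaDeviceSynchronize() between them. The matrix is kept as a flat array here because a single cudaMalloc call returns one contiguous buffer.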
