Stars
512
Forks
84
Language
C++
Last Updated
Mar 22, 2024
Similar Repos
Language | Stars | Description | Updated At | |
---|---|---|---|---|
C++ | 23 | GPU Matrix Library - A CUDA-based C++ wrapper and syntax sugars for NVIDIA CUBLAS | May 10, 2023 | |
C++ | 16 | A cuda & mkl implementation of closed-form matting | May 09, 2023 | |
Python | 89 | A Fast Multi-processing BERT-Inference System | Aug 28, 2022 | |
C++ | 94 | optimized BERT transformer inference on NVIDIA GPU. https://arxiv.org/abs/2210.03052 | Apr 18, 2023 | |
C++ | 8 | CUDA implementation of Tractable Approximate Gaussian Inference | Mar 08, 2023 | |
Cuda | 7 | A CUDA implementation of SIFT for NVidia GPUs | Apr 18, 2023 | |
Rust | 5 | A lightweight and pleasant Rust wrapper for Intel MKL | Mar 16, 2023 | |
Cuda | 15 | optimized realtime harmonic/percussive source separation using the GPU (NVIDIA CUDA) and CPU (Intel IPP) | Mar 13, 2023 | |
TeX | 26 | Opensource GIS Tool leveraging NVIDIA CUDA and pyCuda for fast Raster Analysis | Mar 27, 2023 | |
C++ | 2 | Parallel implementation of NW algorithms with NVIDIA GPU and CUDA C++ | Mar 11, 2023 | |
C++ | 14 | Old NVIDIA CUDA implementation of salted MD5 brute-force | Apr 26, 2022 | |
C++ | 2 | Heterogeneous Ethereum Miner with support for AMD, Intel and Nvidia GPUs using SYCL, OpenCL and … | Apr 15, 2022 | |
Jupyter Notebook | 2 | Large scale K-means and K-nn implementation on NVIDIA GPU / CUDA | Jul 20, 2022 | |
Jupyter Notebook | 734 | Large scale K-means and K-nn implementation on NVIDIA GPU / CUDA | Apr 27, 2023 | |
C# | 4 | Simple MLP / CNN / RNN / LSTM neural-networks implementation in csharp using Intel-MKL-Library. | Mar 04, 2023 | |
C | 3 | RSVDPACK: Implementations of fast algorithms for computing the low rank SVD, interpolative and CUR decompositions … | Mar 25, 2022 | |
C | 78 | RSVDPACK: Implementations of fast algorithms for computing the low rank SVD, interpolative and CUR decompositions … | Feb 18, 2023 | |
C++ | 203 | MTCNN C++ implementation with NVIDIA TensorRT Inference accelerator SDK | Apr 30, 2023 | |
C | 3 | Comparison of Serial Modified Gram-Schmidt, Householder (multi-core CPU MKL) and Givens (GPU cuBLAS) QR-decomposition | Apr 22, 2023 | |
Cuda | 13 | This is super fast ambiguity function created for NVIDIA cards with CUDA technology. | Apr 24, 2023 | |
Python | 57 | Code for the paper "BERT Loses Patience: Fast and Robust Inference with Early Exit". | May 01, 2023 | |
C | 5 | OpenCL Precompiler for nVidia, Intel and AMD platforms | Jan 29, 2019 | |
None | 3 | GPUs process monitoring for AMD, Intel and NVIDIA | Jul 21, 2023 | |
Elixir | 4 | NVIDIA GPU CUDA library bindings for Erlang and Elixir. | Dec 16, 2021 | |
None | 2 | Installation guide for NVIDIA driver, CUDA, cuDNN and TensorRT | Nov 03, 2021 | |
C++ | 2 | I wanted to try cublas but it seems to be over 600 times slower than … | Dec 04, 2022 | |
JavaScript | 11 | Intel/Hybrid/NVIDIA GPU Switch and show GPU status | Jun 04, 2022 | |
Python | 36 | Hardware-accelerated DNN model inference ROS2 packages using NVIDIA Triton/TensorRT for both Jetson and x86_64 with … | Jun 20, 2022 | |
Cuda | 2 | This is a neural network frame, implemented through C++ and CUDA libraries, including cuDNN, cuBLAS, … | Feb 14, 2022 | |
JavaScript | 5 | Access NVIDIA CUDA documentation and switch between versions quickly & easily | Mar 28, 2023 | |
C | 2183 | waifu2x converter ncnn version, runs fast on intel / amd / nvidia / apple-silicon GPU … | Aug 15, 2022 | |
Python | 47 | Method to improve inference time for BERT. This is an implementation of the paper titled … | Apr 12, 2022 | |
Shell | 124 | Shadowplay's Replay Feature On Linux For Nvidia, AMD and Intel | Apr 30, 2023 | |
C++ | 1204 | a fast and user-friendly runtime for transformer inference (Bert, Albert, GPT2, Decoders, etc) on CPU … | Aug 11, 2022 | |
C | 3 | Fast implementation of the neighbour-joining phylogenetic inference method | Aug 31, 2020 | |
Python | 89 | Bert-classification and bert-dssm implementation with keras. | Jun 28, 2022 | |
C | 372 | real-cugan converter ncnn version, runs fast on intel / amd / nvidia / apple-silicon GPU … | Aug 14, 2022 | |
C++ | 9 | Sparse Boolean linear algebra for Nvidia Cuda, OpenCL and CPU computations | Jul 03, 2022 | |
C++ | 11 | Greyscale image using NVIDIA CUDA 5 Toolkit and OpenCV in C++ | Sep 04, 2021 | |
Shell | 2 | Installation script to install Nvidia driver and CUDA automatically in Ubuntu | Jan 24, 2023 | |
Cuda | 11 | Shared memory overlap-and-save method for NVIDIA GPUs using CUDA | Nov 25, 2022 | |
Dockerfile | 2 | Docker image that has intel-mkl installed. Provides debian:buster and ubuntu:bionic based images on docker hub. | Nov 30, 2022 | |
None | 2 | Linux power optimization tutorial for Nvidia, Intel and Ubuntu based distributions. | Jan 18, 2022 | |
Python | 4 | Switch between Intel and Nvidia graphics easily with a Budgie applet | May 05, 2020 | |
Cuda | 12 | GPUCFR is a parallel implementation of Counterfactual Regret Minimization (CFR) in C++ and CUDA C … | Apr 05, 2023 | |
Shell | 2 | scientific python 2.7 with intel MKL built into virtual machines. Docker, AMI, Virtualbox, Vmware, KVM … | Feb 22, 2020 | |
Python | 3 | Fast inference of deepcharuco model using onnx and improved inference setup | Jan 24, 2024 | |
Cuda | 2 | Fast random number generator for C++ and CUDA | Oct 17, 2022 | |
Python | 2 | BERT Inference on CPU with Torch, ONNX Runtime, OpenVINO, and TVM. | Mar 06, 2023 | |
FORTRAN | 28 | a tester for BLAS libraries including OpenBLAS and Intel MKL. This project is based on … | Jun 12, 2021 |