Abdullahrasheed45 committed on
Commit b274517 · verified · 1 Parent(s): a330b47

Update README.md

Files changed (1)
  1. README.md +45 -31
README.md CHANGED
@@ -1,3 +1,4 @@
 
  title: Ministral WebGPU
  emoji: ⚡️
  colorFrom: red
@@ -9,43 +10,56 @@ short_description: Frontier multimodal AI, running entirely in your browser.
  app_build_command: npm run build
  app_file: dist/index.html
  models:
- - mistralai/Ministral-3-3B-Instruct-2512-ONNX
- - mistralai/Ministral-3-3B-Instruct-2512


- AI Multimodal WebGPU Assistant
- Developer: Muhammad Abdullah Rasheed Research Assistant @ Cambridge | MSc Data Science & AI '25 | Google WTM Scholar

- Overview
  This project demonstrates cutting-edge browser-based AI by running a complete 3B parameter multimodal language model entirely client-side using WebGPU acceleration. No servers, no API calls, no data sent anywhere - complete privacy and instant inference.

- Key Features
- Privacy-First Architecture: The entire Ministral-3B model runs locally in your browser using WebGPU - your video feed never leaves your device
- Real-Time Multimodal AI: Live camera feed processing with visual question answering capabilities
- WebGPU Acceleration: Leveraging the latest browser GPU APIs for near-native performance
- Zero Backend Dependencies: No API keys, no server calls, no external services required
- Cross-Platform: Works seamlessly across modern browsers with WebGPU support
- Technical Stack
- Model: Ministral-3-3B-Instruct (quantized for browser deployment)
- Runtime: Transformers.js for in-browser inference
- Compute: WebGPU API for GPU acceleration
- Frontend: Modern JavaScript with WebAssembly integration
- Use Cases
- Visual question answering from live camera feed
- Real-time scene understanding and description
- Privacy-sensitive AI applications
- Edge computing demonstrations
- Educational tool for AI and browser technologies
- Why This Matters
  This project showcases the future of AI deployment - moving powerful language models from cloud servers to the edge, where they can provide instant, private, and accessible intelligence without compromising user privacy or requiring expensive infrastructure.

- Author
- Muhammad Abdullah Rasheed
- Research Assistant | AI & Machine Learning Researcher

- 🎓 MSc Data Science & AI '25, Google WTM Scholar
- 🔬 Research areas: Computer Vision, NLP, Climate AI
- 💼 Experience: Gesture Recognition, Backend Development, ML Engineering
- 🔗 LinkedIn | GitHub | HuggingFace
- License
  Apache-2.0
 
+ ---
  title: Ministral WebGPU
  emoji: ⚡️
  colorFrom: red
 
  app_build_command: npm run build
  app_file: dist/index.html
  models:
+ - mistralai/Ministral-3-3B-Instruct-2512-ONNX
+ - mistralai/Ministral-3-3B-Instruct-2512
+ ---
+ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+ # AI Multimodal WebGPU Assistant

+ **Developer:** Muhammad Abdullah Rasheed
+ **Research Assistant @ Cambridge | MSc Data Science & AI '25 | Google WTM Scholar**
+
+ ## Overview

  This project demonstrates cutting-edge browser-based AI by running a complete 3B parameter multimodal language model entirely client-side using WebGPU acceleration. No servers, no API calls, no data sent anywhere - complete privacy and instant inference.

+ ## Key Features
+
+ - **Privacy-First Architecture**: The entire Ministral-3B model runs locally in your browser using WebGPU - your video feed never leaves your device
+ - **Real-Time Multimodal AI**: Live camera feed processing with visual question answering capabilities
+ - **WebGPU Acceleration**: Leveraging the latest browser GPU APIs for near-native performance
+ - **Zero Backend Dependencies**: No API keys, no server calls, no external services required
+ - **Cross-Platform**: Works seamlessly across modern browsers with WebGPU support
+
+ ## Technical Stack
+
+ - **Model**: Ministral-3-3B-Instruct (quantized for browser deployment)
+ - **Runtime**: Transformers.js for in-browser inference
+ - **Compute**: WebGPU API for GPU acceleration
+ - **Frontend**: Modern JavaScript with WebAssembly integration
+
+ ## Use Cases
+
+ - Visual question answering from live camera feed
+ - Real-time scene understanding and description
+ - Privacy-sensitive AI applications
+ - Edge computing demonstrations
+ - Educational tool for AI and browser technologies
+
+ ## Why This Matters
+
  This project showcases the future of AI deployment - moving powerful language models from cloud servers to the edge, where they can provide instant, private, and accessible intelligence without compromising user privacy or requiring expensive infrastructure.

+ ## Author
+
+ **Muhammad Abdullah Rasheed**
+ Research Assistant | AI & Machine Learning Researcher
+ - 🎓 MSc Data Science & AI '25, Google WTM Scholar
+ - 🔬 Research areas: Computer Vision, NLP, Climate AI
+ - 💼 Experience: Gesture Recognition, Backend Development, ML Engineering
+ - 🔗 [LinkedIn](https://www.linkedin.com/in/muhammad-abdullahrasheed-/) | [GitHub](https://github.com/Abdullahrasheed45) | [HuggingFace](https://huggingface.co/Abdullahrasheed45)
+
+ ## License

  Apache-2.0
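
Note on the "Cross-Platform" and "WebGPU Acceleration" bullets in the updated README: both depend on the browser actually exposing WebGPU. The commit does not show how the app checks for this, so the following is only a minimal sketch of one way a page could detect support before loading a model. The names `NavigatorLike`, `GpuLike`, `hasWebGpuApi`, and `detectWebGpu` are hypothetical and not part of this repository; the only real API assumed is the standard `navigator.gpu.requestAdapter()` entry point.

```typescript
// Minimal structural types so the sketch compiles without WebGPU type
// definitions. In a real page the browser's `navigator` would be passed in.
interface GpuLike {
  requestAdapter(): Promise<unknown>;
}
interface NavigatorLike {
  gpu?: GpuLike | null;
}

type WebGpuStatus = "available" | "no-adapter" | "unsupported";

// Synchronous check: does the browser expose the WebGPU entry point at all?
function hasWebGpuApi(nav: NavigatorLike): boolean {
  return nav.gpu != null;
}

// Asynchronous check: the API can exist while no usable adapter is returned
// (for example on unsupported hardware), so the adapter request is awaited.
async function detectWebGpu(nav: NavigatorLike): Promise<WebGpuStatus> {
  const gpu = nav.gpu;
  if (gpu == null) {
    return "unsupported";
  }
  try {
    const adapter = await gpu.requestAdapter();
    return adapter ? "available" : "no-adapter";
  } catch {
    // Treat a rejected adapter request the same as having no adapter.
    return "no-adapter";
  }
}
```

In a page this might be called as `detectWebGpu(navigator as NavigatorLike)`, showing a fallback message for any result other than `"available"`. The cast is there because older DOM type definitions do not declare `navigator.gpu`.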