Commit 413bdc4 by Abdullahrasheed45 · verified · 1 Parent(s): 7b4c8a5

Update README.md

Files changed (1)
  1. README.md +49 -11
README.md CHANGED
@@ -1,13 +1,51 @@
- ---
- title: AI Multimodal Web GPU Assistant
- emoji: 🌖
- colorFrom: purple
- colorTo: red
- sdk: gradio
- sdk_version: 6.1.0
- app_file: app.py
  pinned: false
- license: mit
- ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ title: Ministral WebGPU
+ emoji: ⚡️
+ colorFrom: red
+ colorTo: yellow
+ sdk: static
  pinned: false
+ license: apache-2.0
+ short_description: Frontier multimodal AI, running entirely in your browser.
+ app_build_command: npm run build
+ app_file: dist/index.html
+ models:
+ - mistralai/Ministral-3-3B-Instruct-2512-ONNX
+ - mistralai/Ministral-3-3B-Instruct-2512
 
+
+ AI Multimodal WebGPU Assistant
+ Developer: Muhammad Abdullah Rasheed Research Assistant @ Cambridge | MSc Data Science & AI '25 | Google WTM Scholar
+
+ Overview
+ This project demonstrates cutting-edge browser-based AI by running a complete 3B parameter multimodal language model entirely client-side using WebGPU acceleration. No servers, no API calls, no data sent anywhere - complete privacy and instant inference.
+
+ Key Features
+ Privacy-First Architecture: The entire Ministral-3B model runs locally in your browser using WebGPU - your video feed never leaves your device
+ Real-Time Multimodal AI: Live camera feed processing with visual question answering capabilities
+ WebGPU Acceleration: Leveraging the latest browser GPU APIs for near-native performance
+ Zero Backend Dependencies: No API keys, no server calls, no external services required
+ Cross-Platform: Works seamlessly across modern browsers with WebGPU support
+ Technical Stack
+ Model: Ministral-3-3B-Instruct (quantized for browser deployment)
+ Runtime: Transformers.js for in-browser inference
+ Compute: WebGPU API for GPU acceleration
+ Frontend: Modern JavaScript with WebAssembly integration
+ Use Cases
+ Visual question answering from live camera feed
+ Real-time scene understanding and description
+ Privacy-sensitive AI applications
+ Edge computing demonstrations
+ Educational tool for AI and browser technologies
+ Why This Matters
+ This project showcases the future of AI deployment - moving powerful language models from cloud servers to the edge, where they can provide instant, private, and accessible intelligence without compromising user privacy or requiring expensive infrastructure.
+
+ Author
+ Muhammad Abdullah Rasheed
+ Research Assistant | AI & Machine Learning Researcher
+
+ 🎓 MSc Data Science & AI '25, Google WTM Scholar
+ 🔬 Research areas: Computer Vision, NLP, Climate AI
+ 💼 Experience: Gesture Recognition, Backend Development, ML Engineering
+ 🔗 LinkedIn | GitHub | HuggingFace
+ License
+ Apache-2.0
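
The added README states that the app works only in browsers with WebGPU support. As a rough illustration of what that requirement means in practice, the TypeScript sketch below shows one way a page might check for WebGPU before trying to load a model. It is not taken from this project: `detectWebGpu`, `NavigatorLike`, `GpuLike`, and the status strings are hypothetical names chosen for the example. The only real API it relies on is the standard `navigator.gpu.requestAdapter()` call, which resolves to `null` when no suitable adapter exists.

```typescript
// Minimal structural types so the sketch compiles without DOM or WebGPU typings.
// These are illustrative stand-ins, not the official WebGPU type definitions.
interface GpuLike {
  requestAdapter(): Promise<unknown | null>;
}

interface NavigatorLike {
  gpu?: GpuLike;
}

type WebGpuStatus = "available" | "no-api" | "no-adapter";

// Hypothetical helper: reports whether WebGPU looks usable on the given navigator.
async function detectWebGpu(nav: NavigatorLike | undefined): Promise<WebGpuStatus> {
  // Browsers without WebGPU do not expose `navigator.gpu` at all.
  if (!nav || !nav.gpu) {
    return "no-api";
  }
  try {
    // requestAdapter() resolves to null when no suitable GPU adapter is found.
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "available" : "no-adapter";
  } catch {
    // Treat an unexpected failure the same as a missing adapter.
    return "no-adapter";
  }
}
```

In a browser, the helper could be called as `detectWebGpu(navigator)`, and the page could show a fallback message for any result other than `"available"`. Passing the navigator in as a parameter, rather than reading the global directly, is a design choice made here so the check can be exercised outside a browser.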