# Project

| # | Title | Team Members | TA | Documents | Sponsor |
| --- | --- | --- | --- | --- | --- |
| 13 | Epicast: Augmented Board Game | Di Wu, Jianli Jin, Yueshen Li, Zihao Zhu | Yushi Cheng | design_document1.pdf, design_document2.pdf, final_paper1.pdf, final_paper2.pdf, other1.pdf, proposal1.pdf | |

# RFA


## Team Members

- Yueshen Li, yueshen7
- Jianli Jin, jianlij2
- Di Wu, diw10
- Zihao Zhu, zihao9


## Title

Epicast: Augmented Board Game

## Problem

Dungeons & Dragons (D&D) is a game that thrives on the breadth of imagination and the depth of interaction. It empowers players to construct elaborate worlds and characters drawn from the vast expanse of their creativity. This unfettered freedom not only offers a canvas for creation and interaction but also makes each gameplay experience profoundly personal and distinct. However, the richness of these imagined scenes and environments is often bottlenecked by the need for verbal expression and lacks an intuitive, sensory display. This limitation hampers the visualization of the game's full potential, presenting a hurdle for some players and constricting the game's broader appeal. In addition, the Dungeon Master (DM), the player who serves as the game's narrative architect, is often burdened with extensive preparatory work, juggling game mechanics with storytelling, which can be arduous and time-consuming.


## Solution Overview

"Project Epicast" is a comprehensive system designed to enhance the Dungeons & Dragons gaming experience. Central to this system is a GPT-powered AI that serves as an automated Dungeon Master, guiding gameplay with intelligent narrative creation and player interaction.

The visual aspect is handled by an overhead projector, capable of displaying intricate game scenes, animations, and simulating actions such as dice rolls directly onto the gaming surface. Its height is adjustable for optimal image quality. Gesture recognition is enabled through a sophisticated camera, allowing for intuitive control and the ability to capture memorable moments. Audio immersion is provided by integrated speakers and microphones for voice commands, narrative flow, and dynamic sound effects. Completing the sensory experience are ambient lights that adjust to the game's mood, providing synchronized lighting effects.

Lastly, while our focus is on enhancing the D&D experience, it's important to recognize that our board game experience-augmenting module has widespread applications. The demand for immersive, enriched board game interactions extends beyond D&D. Our target example of D&D serves as an ideal prototype for a broader market of board games seeking similar enhancements for a more engaging player experience.

## Solution Components

### [Real-time Data Processing System]

The real-time data processing system captures, processes, and generates data in real time. It consists of a data transmission module, a sophisticated camera, a projector, an integrated speaker and microphone, and an ambient light set.

- The data transmission module carries sensor data from the peripherals to the system's processing unit, and carries the processing unit's instructions and output data back to the peripherals that need them. This can be done over a wired or wireless connection.

- The camera is the core component of the real-time data processing system, used to capture image or video data. This data contains vital information such as players' hand gestures (a capture sketch follows this list). The camera can be an ordinary USB camera or a high-resolution, high-speed industrial-grade one.

- The projector is another core component of the real-time data processing system. Projection is a technique for casting an image or video onto an object or plane; in our setting, the projector projects images onto a wall or slab to inform players of the game state, and the processing unit must send the relevant data to the projector in real time. As with the camera, the projector can be an ordinary consumer model or a high-resolution, high-speed industrial-grade one.

- The microphone gathers players' voices and converts them into digital audio, and the speaker converts digital data sent by the processing unit back into sound. An integrated speaker and microphone handles the voice input and output of the whole system.

- The ambient light set provides an ambient light source with lighting conditions that adjust to the game's mood, producing synchronized lighting effects that enhance the player's experience. The light set can also improve the visibility of the scene for the camera, based on camera feedback, and thereby improve the accuracy of the real-time recognition system.
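
To make the camera component concrete, here is a minimal gesture-capture sketch. It is only a sketch under stated assumptions: a Python environment with the OpenCV and MediaPipe packages, an ordinary USB camera at index 0, and a placeholder landmark-to-action mapping, since our gesture vocabulary is not yet fixed.

```python
# Minimal gesture-capture sketch (assumes: opencv-python, mediapipe,
# and an ordinary USB camera at index 0; the landmark-to-action mapping
# is a placeholder, since the gesture vocabulary is not fixed yet).
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(max_num_hands=2, min_detection_confidence=0.6)
cap = cv2.VideoCapture(0)

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # MediaPipe expects RGB; OpenCV delivers BGR.
    result = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if result.multi_hand_landmarks:
        for hand in result.multi_hand_landmarks:
            wrist = hand.landmark[0]  # landmark 0 is the wrist
            # Placeholder: map landmarks to a game action (e.g. a dice-roll command).
            print(f"hand at ({wrist.x:.2f}, {wrist.y:.2f})")
    cv2.imshow("camera", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```

MediaPipe serves here purely as a stand-in detector; an industrial-grade camera would feed the same loop through a different capture backend.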



### [GPT-Core DM System]

The GPT-Core DM system acts as an assistant to the Dungeon Master (DM), providing support and assistance during gameplay. Through modeling and data training, GPT-Core should perform the following basic functions as the DM's assistant and carry a game to completion:

- **Generate adventure missions and plots**: The DM can provide key information to GPT-Core, such as mission type, location, and characters, and GPT-Core then generates a complete adventure mission with a coherent plot, including enemies, puzzles, rewards, etc. (a minimal prompting sketch for this function follows this list).

- **Generate player characters and NPCs (Non-Player Characters)**: The DM can use GPT-Core to ask players what kind of characters they want, and then generate the corresponding characters with balanced properties such as backstory, personality traits, and goals. GPT-Core can also easily generate detailed information for many NPCs to enhance game quality.

- **Generate other detailed information**: The DM can use GPT-Core to generate any other detail of an environment, such as the layout, decoration, or smell of a room, with vivid descriptions that let players engage more deeply with the game world. For interactions between player characters and NPCs, GPT-Core can also describe the NPCs' reactions to the players' actions, helping users immerse themselves in the game.
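
Below is a minimal sketch of how GPT-Core might be prompted for the first function, using the OpenAI Python client. The model name, the system prompt, and the exact fields the DM supplies are illustrative assumptions, not our final design.

```python
# Minimal GPT-Core DM sketch (assumes: the openai Python package, an
# OPENAI_API_KEY in the environment, and an illustrative model name).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are a Dungeon Master assistant for Dungeons & Dragons. Given a "
    "mission type, a location, and the party's characters, produce a short "
    "adventure mission with enemies, puzzles, and rewards."
)

def generate_adventure(mission_type: str, location: str, characters: list[str]) -> str:
    """Ask GPT-Core for a complete adventure mission from the DM's key info."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name; any chat model works here
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {
                "role": "user",
                "content": (
                    f"Mission type: {mission_type}\n"
                    f"Location: {location}\n"
                    f"Characters: {', '.join(characters)}"
                ),
            },
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(generate_adventure("rescue", "an abandoned dwarven mine", ["ranger", "cleric"]))
```

In the full system this call would sit inside a dialogue loop so the DM can iterate on the generated mission.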



### [Sensor Assistance System]

Most of the time, we need to adjust the projector's orientation so it projects the screen where we want, or the camera's orientation so that players' hand gestures are captured. The mechanical engineering design covers the physical structure and installation of the projector and the camera. Common design considerations include:

- **Mounting bracket**: To securely mount the projector and camera in the desired positions, a suitable mounting bracket needs to be designed. These brackets should adapt to different mounting environments and provide adjustable features to fine-tune projection angles and positions.

- **Adjustment mechanism**: To facilitate adjustment and alignment of the projector and camera, adjustable mechanisms such as rotation and tilt stages can be designed to support different projection angles and positions (a servo-control sketch follows this list).

- **Cooling system**: The projector and camera, especially the projector, generate heat during operation, so an effective cooling system may need to be designed to ensure stable operation and prevent overheating.

- **Dust and protection**: To protect the projector and camera from dust, moisture, and other external factors, appropriate protective measures such as filters and seals need to be designed.

- Other small mechanisms can be added as needed to assist the remaining sensors.
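
For the adjustment mechanism, a hobby pan/tilt servo rig is one plausible realization. The sketch below assumes a Raspberry Pi driving two standard 50 Hz hobby servos on BCM pins 18 and 19 via the RPi.GPIO library; the pin choices and the duty-cycle-to-angle mapping are assumptions that would need calibration against the actual hardware.

```python
# Pan/tilt servo sketch (assumes: Raspberry Pi, RPi.GPIO, two 50 Hz hobby
# servos on BCM pins 18 and 19; the duty-cycle mapping needs calibration).
import time
import RPi.GPIO as GPIO

PAN_PIN, TILT_PIN = 18, 19

GPIO.setmode(GPIO.BCM)
GPIO.setup(PAN_PIN, GPIO.OUT)
GPIO.setup(TILT_PIN, GPIO.OUT)

pan = GPIO.PWM(PAN_PIN, 50)   # 50 Hz is standard for hobby servos
tilt = GPIO.PWM(TILT_PIN, 50)
pan.start(7.5)   # roughly the center position for a typical servo
tilt.start(7.5)

def set_angle(pwm: GPIO.PWM, degrees: float) -> None:
    """Map 0-180 degrees to a 2.5%-12.5% duty cycle (typical, not universal)."""
    pwm.ChangeDutyCycle(2.5 + degrees / 18.0)
    time.sleep(0.3)  # give the servo time to reach the position

set_angle(pan, 90)   # aim the projector at the table center
set_angle(tilt, 45)  # tilt the camera down toward players' hands

pan.stop()
tilt.stop()
GPIO.cleanup()
```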


## Criterion for Success

- Apply AI-based hand-gesture and audio detection to capture players' actions.
- Split the game-scene projection region from the gesture-detection region within a multi-module integrated hardware system, improving the game experience.
- Create a Dungeon Master AI using GPT as its core.
- Create a better experience with a free-moving projector and several ambient lights.

## Distribution of Work

- Yueshen Li will work on the identification module and the real-time signal transmission system.

- Jianli Jin will work on GPT-DM model building and data training.

- Di Wu will work on hardware appliance construction and the data logging system.

- Zihao Zhu will work on the physical design and extension-module integration.

# Featured Project: A Wearable Device Outputting Scene Text For Blind People

Hangtao Jin, Youchuan Liu, Xiaomeng Yang, Changyu Zhu

# Revised

We discussed the problem with our mentor, Prof. Gaoang Wang, and arrived at a solution.

## TEAM MEMBERS (NETID)

Xiaomeng Yang (xy20), Youchuan Liu (yl38), Changyu Zhu (changyu4), Hangtao Jin (hangtao2)

## INSTRUCTOR

Prof. Gaoang Wang

## LINK

This idea was pitched on Web Board by Xiaomeng Yang.

https://courses.grainger.illinois.edu/ece445zjui/pace/view-topic.asp?id=64684

## PROBLEM DESCRIPTION

Nowadays, there are about 12 million visually impaired people in China, yet it is rare to see blind people on the street. One reason is that when blind people travel to an unfamiliar location, it is difficult for them to figure out where they are. They are usually equipped with navigation devices, but the accuracy of such devices is insufficient, and it is hard for blind people to find the exact position of the destination after arriving nearby. Therefore, we'd like to make a device that reads out the scene-text information around the destination so that blind people can reach the exact place.

## SOLUTION OVERVIEW

We'd like to make a device with a micro camera and an earphone. At the click of a button, the camera takes a picture and sends it to a remote server through a communication subsystem. Text is then extracted and recognized from the picture using a neural network, converted to a voice message with the Google text-to-speech API, and sent back to the user through the earphone. The device can be attached to the glasses that blind people wear.

The blind already use navigation equipment that can tell them the location and direction of their destination, but they still need detailed directions at the destination itself, and our wearable device helps solve this problem. The camera is fixed to the head, just like our eyes, so when blind people turn their heads, the camera captures scene text in different directions. Our target scenario is identifying store names along the side of a street. These store signs are generally not tall, about two stories high, so blind people can look up and down to let the camera capture the whole storefront. Therefore, no matter where the store name is, it can be recognized.

For example, if a blind person aims to go to a book store, the navigation app will tell him, when he is near the destination, that he has arrived and the store is on his right. However, there may be several stores on his right. He can then face right, take a photo in that direction, and find out whether the store is there. If not, he can turn his head a little and take another photo in the new direction.

![figure1](https://courses.grainger.illinois.edu/ece445zjui/pace/getfile/18612)

![figure2](https://courses.grainger.illinois.edu/ece445zjui/pace/getfile/18614)

## SOLUTION COMPONENTS

### Interactive Subsystem

The interactive subsystem interacts with the blind and the environment.

- A 3-D printed frame that attaches to the glasses through a snap-fit structure and holds all the accessories in place

- A micro camera that takes pictures

- An earphone that outputs the speech

### Communication Subsystem

The communication subsystem is used to connect the interactive subsystem with the software processing subsystem.

- A Raspberry Pi (RPi) receives the images taken by the camera and sends them to the remote server through a WiFi module. After processing on the remote server, the RPi receives the speech information (an .mp3 file). A client-side sketch is given below.
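
The following is a minimal sketch of this client-side flow. The server URL and the `/recognize` endpoint are hypothetical placeholders, the camera is assumed to be reachable through OpenCV, and playback is delegated to the common `mpg123` command-line player; all of these are assumptions, not the final design.

```python
# Client-side sketch for the RPi (assumes: opencv-python, requests, the
# mpg123 CLI player, and a hypothetical server endpoint /recognize).
import subprocess
import cv2
import requests

SERVER_URL = "http://example-server.local:5000/recognize"  # placeholder address

def capture_and_speak() -> None:
    cap = cv2.VideoCapture(0)  # the micro camera
    ok, frame = cap.read()
    cap.release()
    if not ok:
        return
    ok, jpeg = cv2.imencode(".jpg", frame)
    if not ok:
        return
    # Send the photo; the server answers with an .mp3 speech file.
    resp = requests.post(
        SERVER_URL,
        files={"image": ("scene.jpg", jpeg.tobytes())},
        timeout=30,
    )
    resp.raise_for_status()
    with open("/tmp/speech.mp3", "wb") as f:
        f.write(resp.content)
    subprocess.run(["mpg123", "-q", "/tmp/speech.mp3"])  # play through the earphone

if __name__ == "__main__":
    capture_and_speak()
```

In the real device, `capture_and_speak` would be bound to the physical button (e.g., via a GPIO interrupt) rather than run once from main.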

### Software Processing Subsystem

The software processing subsystem processes the images and outputs speech. It consists of two subparts, a text-recognition part and a text-to-speech part (a server-side sketch follows this list).

- An OCR neural network that extracts and recognizes Chinese text from the environmental images transported by the communication subsystem.

- The Google text-to-speech API, which converts the recognized text to speech.
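
Below is a minimal server-side sketch under stated assumptions: Flask provides the HTTP endpoint, pytesseract with the `chi_sim` language pack stands in for our own OCR network, and the `gTTS` package stands in for the Google text-to-speech API. These are illustrative substitutes, not the components we will ultimately train or integrate.

```python
# Server-side sketch (assumes: flask >= 2.0, pillow, pytesseract with the
# chi_sim traineddata installed as a stand-in for our OCR network, and
# gTTS as a stand-in for the Google text-to-speech API).
import io

import pytesseract
from PIL import Image
from flask import Flask, request, send_file
from gtts import gTTS

app = Flask(__name__)

@app.route("/recognize", methods=["POST"])
def recognize():
    # Decode the uploaded scene photo.
    image = Image.open(request.files["image"].stream)
    # OCR: extract Chinese scene text (placeholder for our trained network).
    text = pytesseract.image_to_string(image, lang="chi_sim").strip()
    if not text:
        text = "未识别到文字"  # "no text recognized" fallback, spoken in Mandarin
    # TTS: convert the recognized text to Mandarin speech.
    mp3 = io.BytesIO()
    gTTS(text, lang="zh-CN").write_to_fp(mp3)  # language code may vary by gTTS version
    mp3.seek(0)
    return send_file(mp3, mimetype="audio/mpeg", download_name="speech.mp3")

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

Paired with the client sketch above, this endpoint completes the photo-to-speech round trip described in the Solution Overview.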

## CRITERION FOR SUCCESS

- Use a neural network to recognize Chinese scene text successfully.

- Use the Google text-to-speech API to convert the recognized text to speech.

- The device can transmit the environment pictures or video to the server and receive the speech information correctly.

- Blind people can use the speech information to locate their position.