CS663 | computer vision
  • outline
  • projects
  • syllabus
  • links
Project 2: Mobile Imaging Research

You will propose a mobile imaging application that in some way ASSISTS one of the following groups:

  • a blind or low-vision (not completely blind) person

  • senior citizen

  • someone with reduced physical mobility

  • an animal

  • children

  • athletes/health

 

 

Example: consider the case of a blind person. It is up to you to think of the application (e.g. an app that tells the user which shaker is the salt and which is the pepper).

Think outside the box, but make it doable (this is hard). Even if you solve the problem only partially I may be happy; talk to me.

  • salt and pepper shaker recognition, or how about in a coffee shop: cream versus milk, sugar versus substitute?
  • country road walking: follow the white line on the side of the road
  • city road walking: stay on the sidewalk
  • Help! Get me out of Here App, and an entrance or door finding in a room app: tell the user whether the door is straight ahead, to the left, or to the right of the scene.

      doors, paper on door++ detection



  • label reading in a grocery store, or even just the start of this: recognizing characters with OCR (someone else could then use text to speech to read them out).
  • what about a low-vision person: what could you do to help them out?
  • Traffic Light Crossing App: a low-vision or blind person comes to an intersection that has no audio crossing signal to say it is safe to cross. The user points the camera up at the intersection and, using the video stream, your app finds the traffic light, tells the user to hold the phone still, waits for the NEXT green light, and then tells the user to walk. (It waits for the next green light so that it does not tell the user to walk at the end of a green light.)

      traffic light
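
The "wait for the NEXT green light" rule in the Traffic Light Crossing App idea can be sketched as a small state machine. This is only a minimal sketch under stated assumptions: it assumes some recognition layer (not shown) already classifies each video frame as "red", "yellow", "green", or None when no light is found, and the function name and prompt strings are hypothetical.

```python
AIM = "Point the camera at the traffic light."
WAIT = "Hold the phone still and wait."
WALK = "Walk."


def crossing_prompts(light_colors):
    """Return one prompt per video frame.

    light_colors is the per-frame output of a hypothetical recognition
    layer: "red", "yellow", "green", or None when no light was found.
    """
    seen_non_green = False  # True once red/yellow was seen with the light in view
    prompts = []
    for color in light_colors:
        if color is None:
            # Light lost: we can no longer tell how old a green is, so start over.
            seen_non_green = False
            prompts.append(AIM)
        elif color == "green":
            # Only a green that follows red/yellow counts as the NEXT green.
            prompts.append(WALK if seen_non_green else WAIT)
        elif color in ("red", "yellow"):
            seen_non_green = True
            prompts.append(WAIT)
        else:
            raise ValueError(f"unknown light color: {color!r}")
    return prompts
```

A green light that is already showing when the user first aims the phone produces "wait", because the app cannot know how much of that green phase is left. Resetting whenever the light is lost is a deliberately conservative choice.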




Application Specifications

Every proposal will most likely propose a different mobile imaging application. You will submit a proposal that will either be approved or returned with the instructor's suggestions for improvement (simplification, or possibly more capabilities, may be required). Because the purpose of the application can vary widely, I am making only the general requirements listed below.

  • Specify precisely which group (see above) you intend to assist and how; give one or more example scenarios.
  • As part of the program you must take pictures (one or more) using the phone's camera, either automatically or with the assistance of the user.
  • You need to have some kind of visual response (even if temporary) showing the photo(s) taken.
  • You need to have some kind of visual response showing the results of your application (examples, not necessarily for you to replicate: a web search listing for the product whose bar code was photographed, a movie/slideshow of pieced-together images, some image processing results, some kind of upload feature, some kind of security pass, etc.).
  • (part of proposal and final project turn-in) You will need to define a business model in terms of audience (demographics: age, income, location, etc.) and projected cost of the application.
  • There must be a recognition layer that involves some kind of Deep Learning Network (e.g. CNN, LSTM, etc.).
  • (part of proposal and final project turn-in) You will need to define a use case diagram and a description of your system. This needs to include prototypes of the interface as it would be seen and experienced by the user, along with hypothetical results.



    Research:

    1) Ideas - come up with one or at most two ideas that you might like to implement (e.g. iris detector on a mobile device).

    2) 5 papers - for the idea(s) you came up with in #1, find a minimum of 5 papers and post them to the Blackboard Discussion board for Project 3 Research Postings.

    3) Reviews - pick 5 papers posted by someone else, review them, and give your opinion of what is going on.

     

    Proposal:

    Before beginning this project you must submit a proposal, which must include the following sections:

    • Concept Summary - a few paragraphs on the purpose of your Android imaging application

    • Audience - demographics of intended audience.

    • Application Cost and Projected Success - tell me what you will charge for this application (99 cents, 4.99, or what?) and why it will be successful with your audience: why are they going to buy it?

    • Interface Mockups - drawings (hand drawing is okay if readable) of the interfaces seen by the user as they use the application. You should have more than one, as the application must do something.

    • Use Case - Diagram and Description of use of application

    • References - any (ideally online or electronic you can attach) references you used.

    • Image Processing Routines - a brief description of the kinds of image processing/algorithms you will apply to the one or more images your application takes (possibly with the assistance of the user). Note that you cannot simply display the image taken, or even one that has had a single image processing routine applied to it (unless you can convince me of some powerful business use for it that does not currently exist); this is far too trivial an application. You need to think about the business part and try to come up with an idea that you think will sell. YOU MUST specify what the Deep Learning Network will do in your recognition layer: what are the inputs and outputs?
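
One way to state the inputs and outputs of the recognition layer is to write them down as a small specification. The following is only an illustrative sketch for the salt and pepper shaker example: the class name, the 224x224 RGB input size, and the "neither" class are all assumptions, not requirements of the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecognitionSpec:
    """Hypothetical description of what goes into and comes out of the network."""
    input_shape: tuple   # (height, width, channels) of the image fed to the network
    class_names: tuple   # one name per output score, in output order

    def decode(self, scores):
        """Turn the network's output scores into (label, score) for the user."""
        if len(scores) != len(self.class_names):
            raise ValueError("expected one score per class")
        best = max(range(len(scores)), key=lambda i: scores[i])
        return self.class_names[best], scores[best]


# Assumed example: a CNN that sees one camera frame and scores three classes.
SHAKER_SPEC = RecognitionSpec(
    input_shape=(224, 224, 3),
    class_names=("salt", "pepper", "neither"),
)
```

With a spec like this, the proposal can say in one line that the network takes one RGB camera frame and returns one score per class, and the app speaks or displays the decoded label.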

    Implementation

    Make sure you choose a later (4.0 or on) Android SDK so that you can capture the latest features. Note that later versions of the Android SDK and emulator do not let you use your laptop's camera to capture pictures for testing in emulation mode!


 

 

RANDOM Thoughts......

  1. What about Seizure detection/prediction by facial monitoring??? (data?) [health]
  2. What about working towards an AI companion for the elderly/dogs: detecting boredom and offering games/interaction [seniors/animals]
  3. What about detecting boredom in children while learning and offering something new? [kids]
  4. How to help a dog communicate its emotions to its owners? [animals]
  5. Detecting obstacles for dogs with low vision and telling them what each one is (how?) [animals]
  6. Some kind of elaborate "find my X" app using computer vision for the elderly who forget where something is: you monitor where they put things [blind]
  7. Pain Detection and classification (data???) [health]
  8. Disabled expressing emotions/ exercising whatever (facial / body emotion detection) [health]
  9. XXXX Prevention (like fighting, nose picking, or whatever): teaching children how to behave (train with internet images). Detecting fights on the playground: big brother is watching you.
  10. See this paper: Who Let The Dogs Out? Modeling Dog Behavior From Visual Data. Can we use this encoder/decoder idea to see what children would do in a visual environment for X (gaming/learning/sports), and then autogenerate sequences/new paths to take for a different child or ???
  11. RESERVED (ask me about this project; I am looking for a new master's thesis around this issue): Project into master's thesis. Rover: training by collecting data from a mobile phone mounted on a guide dog's back, following what the dog does to help its owner navigate. Label X seconds / X consecutive frames of video with a directional movement (Forward, Back, Left, Right, Left-Forward, Left-Back, Right-Forward, Right-Back) and then train an LSTM or CNN???? Then direct a remote car/vehicle with a mounted mobile phone to head in whatever direction the incoming video indicates. Question: is there any way an encoder-decoder model could be useful, where the output is the next (x) synthesized image frame(s) and traditional motion estimation is then used to tell the rover what direction to travel? Question: how do we get someone to let us collect data? We would need hours of data and, most importantly, data from many different environments. See motivation: K. Ehsani, H. Bagherinezhad, J. Redmon, R. Mottaghi, A. Farhadi, "Who let the dogs out? Modeling dog behavior from visual data", IEEE Conference on Computer Vision and Pattern Recognition, 2018. https://pjreddie.com/media/files/papers/1803.10827.pdf
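
The frame labeling step in the Rover idea (item 11) could look roughly like the sketch below. It is a minimal sketch under stated assumptions: it assumes fixed-length, non-overlapping windows of X consecutive frames, one human-provided label per window, and leftover frames at the end of the video being dropped. The function name is hypothetical; only the eight direction names come from the item above.

```python
DIRECTIONS = (
    "Forward", "Back", "Left", "Right",
    "Left-Forward", "Left-Back", "Right-Forward", "Right-Back",
)


def label_windows(num_frames, window_size, labels):
    """Pair each window of consecutive frames with its direction label.

    Returns a list of (start_frame, end_frame_exclusive, label) tuples.
    Frames left over after the last full window are dropped (an assumption).
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    num_windows = num_frames // window_size
    if len(labels) != num_windows:
        raise ValueError(f"expected {num_windows} labels, got {len(labels)}")
    samples = []
    for index, label in enumerate(labels):
        if label not in DIRECTIONS:
            raise ValueError(f"unknown direction: {label!r}")
        start = index * window_size
        samples.append((start, start + window_size, label))
    return samples
```

For example, a 10-frame clip with 4-frame windows yields two labeled samples covering frames 0-3 and 4-7. Each sample could then be fed as one training sequence to an LSTM, or frame by frame to a CNN.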

 

 

CS6825

 
