
Unity* 3D Touch GUI Widgets


By Lynn Thompson

Downloads

Download Unity* 3D Touch GUI Widgets [PDF 966KB]
Source Code: (coming soon!)

This article provides an example of using Unity* 3D assets to simulate Windows* graphical user interface (GUI) widgets. The TouchScript package available at no cost from the Unity Asset Store makes several touch gestures available. The example in this article uses the Press Gesture and Release Gesture.

The example begins with a preconfigured scene that has geometry for visual reference points and a capsule asset with a first-person shooter (FPS) controller and the scene’s main camera as a child for navigating the geometry. The scene is initially navigated in the default way, with the FPSController taking its input from the keyboard. To provide an alternate means of input for the FPS controller, I construct a GUI widget out of Unity 3D cubes. This widget has four buttons: Up, Down, Left, and Right. I make the GUI widget visible on the screen by placing it in view of a second scene camera with an altered normalized view port rectangle and a higher depth setting than the main camera. When these buttons receive a touch gesture, a corresponding action is sent to the FPS input controller to produce the desired effect.

I construct an additional four-button widget to control rotation of the capsule asset that holds the scene’s main camera. These buttons let users manipulate the “first person’s” rotation independently of the FPSController while it moves the asset. This functionality uses the Press and Release Gestures.

When complete, running this simulation allows users to touch multiple buttons in varying patterns to navigate the scene. I then explore how you can use and modify the GUI widgets developed in this example to emulate other common gaming controllers, and I discuss the challenges I experienced using touch-based controllers in contrast to keyboard and handheld controllers.

Creating the Example

I begin this example by importing a preconfigured scene I exported from Autodesk 3ds Max*, adding a capsule and configuring it with an FPSController. By default, this controller takes its input from the keyboard. See Figure 1.



Figure 1. Unity* 3D Editor with a scene imported from Autodesk 3ds Max*

Adding Geometry

Next, I add geometry (cubes MoveForward, MoveBackward, MoveLeft, and MoveRight) to simulate a Windows GUI widget. I also add a light and camera to visualize the newly added cubes. To place this camera’s view in the bottom left of the runtime scene, I change both of the normalized view port Rect settings for elements W and H from 0 to 0.25. Also, for the camera to appear in the scene, its Depth setting must be greater than that of the main camera. The Depth setting for the main camera is −1, so the default Depth setting of 0 for the new camera will work. I make the light and cubes children of the camera by dragging these elements onto the camera in the Hierarchy panel. Next, I add the TouchScript > Layers > Camera Layer to the camera by clicking Add Component in the Inspector panel. The GUI widgets won’t function if this Camera layer step is not performed. See Figure 2.
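The viewport and depth settings can also be applied from script instead of the Inspector. Here is a minimal C# sketch of the idea; the class name and the use of Start() are mine, not part of the accompanying scene:

using UnityEngine;

// Hypothetical helper: places a widget camera's view in the lower-left
// corner of the screen and keeps it drawn above the main camera.
public class WidgetCameraSetup : MonoBehaviour
{
    void Start()
    {
        Camera cam = GetComponent<Camera>();
        // Normalized view port rect: x, y, W, H (a 0.25 x 0.25 corner of the screen).
        cam.rect = new Rect(0f, 0f, 0.25f, 0.25f);
        // The main camera's depth is -1, so 0 renders this camera on top of it.
        cam.depth = 0f;
    }
}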



Figure 2. Unity* 3D Editor showing a GUI widget

Adding a GUI Widget

I repeat this process to add a GUI widget to the bottom right of the screen for rotation control of the capsule with the FPSController and main camera. The scene looks like Figure 3, with both of the GUI widgets added to the scene.



Figure 3. Unity* 3D runtime with imported scene and GUI widgets

Connect the Widgets to the FPS Controller

The next step is to connect the new GUIWidget cubes to FPSController. To do so, I modify the default FPS input controller script for the capsule to use variables to instigate movement rather than input from the keyboard. See script FPSInputController.js in the accompanying Unity 3D scene.
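The accompanying scene uses the UnityScript FPSInputController.js. Purely as an illustration of the change, here is a minimal C# sketch of an input controller driven by public variables instead of Input.GetAxis; the class and field names are mine, and CharacterController stands in for the stock CharacterMotor:

using UnityEngine;

// Sketch of an FPS input controller driven by variables instead of
// keyboard axes. The static fields are written by the touch-button
// scripts (for example, MoveLeft sets horizontal = -1).
public class FPSInputControllerTouch : MonoBehaviour
{
    public static float horizontal = 0f;  // -1 = left, 1 = right
    public static float vertical = 0f;    // -1 = backward, 1 = forward

    public float speed = 5f;
    private CharacterController controller;

    void Awake()
    {
        controller = GetComponent<CharacterController>();
    }

    void Update()
    {
        // Same math the stock controller applies to the keyboard axes.
        Vector3 direction = new Vector3(horizontal, 0f, vertical);
        if (direction.sqrMagnitude > 1f) direction.Normalize();
        controller.SimpleMove(transform.rotation * direction * speed);
    }
}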

Adding Gestures to the Widget

Next, I add TouchScript Press and Release Gestures to each move GUIWidget cube by clicking Add Component in the Inspector panel for each cube. The TouchScript menu for selecting a gesture became available when I downloaded and installed the TouchScript package.

After the TouchScript has been added, I add a custom script to the cube to receive the gesture and perform the desired action. Choosing to start with CubeMoveLeft, I add a new MoveLeft script to the cube by clicking Add Component in the Inspector panel. This script sends a Horizontal value of −1 to the FPSController global variable horizontal when the cube receives a Press Gesture. I also add code to this script to change the scale of the cube to visually confirm receipt of the gesture. See the C# script MoveLeft.cs in the accompanying Unity 3D scene.
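As a rough illustration of what such a script can look like, here is a hedged C# sketch. TouchScript's event API has changed across releases, so the StateChanged wiring below, and the FPSInputControllerTouch fields from the earlier sketch, are assumptions rather than the exact contents of the accompanying MoveLeft.cs:

using UnityEngine;
using TouchScript.Gestures;  // assumes the TouchScript package is imported

// Sketch of a touch button: on press, tell the FPS controller to move
// left and shrink the cube for visual feedback; on release, stop.
public class MoveLeft : MonoBehaviour
{
    private Vector3 originalScale;

    void OnEnable()
    {
        originalScale = transform.localScale;
        // Event names may differ slightly between TouchScript versions.
        GetComponent<PressGesture>().StateChanged += OnPress;
        GetComponent<ReleaseGesture>().StateChanged += OnRelease;
    }

    void OnPress(object sender, System.EventArgs e)
    {
        FPSInputControllerTouch.horizontal = -1f;     // start moving left
        transform.localScale = originalScale * 0.8f;  // visual confirmation
    }

    void OnRelease(object sender, System.EventArgs e)
    {
        FPSInputControllerTouch.horizontal = 0f;      // stop
        transform.localScale = originalScale;
    }
}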

Similarly, I add scripts that send −1 for the MoveBackward cube and 1 for the MoveForward and MoveRight GUIWidget cubes. See the C# scripts Move[Backward,Forward,Right].cs in the accompanying Unity 3D scene.

Enabling Button Functionality

At this point, I can use the Move GUI widgets to navigate the scene, but only one at a time. I can’t use the MoveForward and MoveLeft or MoveRight buttons in combination to move at a 45‑degree angle. To enable this functionality, I create an empty GameObject at the top of the hierarchy and use Add Component to add the Touch Manager script from the TouchScript menu. I also add the Win7TouchInput script from the TouchScript Input menu.

Now that the Move buttons work and I can navigate the scene by touching multiple buttons, I’m ready to implement the rotation functionality. These buttons don’t manipulate the FPSController directly; instead, they rotate the capsule holding the FPSController and the scene’s main camera. Using the OnPress and OnRelease functionality as above, the script attached to the RotateLeft GUIWidget cube rotates the FPS capsule and its child main camera to the left while the cube is touched. See the script RotateLeft.cs in the accompanying Unity 3D scene.
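A rotation button can follow the same pattern but apply a continuous rotation while it is held. The following C# sketch is illustrative only; the fpsCapsule reference, rotation speed, and TouchScript event wiring are assumptions:

using UnityEngine;
using TouchScript.Gestures;

// Sketch of a rotate button: while pressed, rotate the FPS capsule
// (and therefore its child main camera) to the left each frame.
public class RotateLeft : MonoBehaviour
{
    public Transform fpsCapsule;          // the capsule holding the main camera
    public float degreesPerSecond = 45f;
    private bool pressed;

    void OnEnable()
    {
        GetComponent<PressGesture>().StateChanged += (s, e) => pressed = true;
        GetComponent<ReleaseGesture>().StateChanged += (s, e) => pressed = false;
    }

    void Update()
    {
        if (pressed)
            fpsCapsule.Rotate(0f, -degreesPerSecond * Time.deltaTime, 0f);
    }
}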

Similarly, I add scripts that send the appropriate rotation vector to the RotateUp, RotateRight, and RotateDown GUIWidget cubes. See the corresponding Rotate*.cs scripts in the accompanying Unity 3D scene.

The Completed Example

This completes “hooking up” the cubes being used as GUI widgets. I can now navigate the scene with touch-controlled movement and rotation in multiple ways by touching and releasing multiple buttons.

I added a script to the main camera to create a video of the scene being run. This script writes a new .png file each frame. See the script ScreenCapture.cs in the accompanying Unity 3D scene.
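A per-frame capture script along these lines can be quite short. The C# sketch below is illustrative; the class name and file-naming scheme are mine, and Application.CaptureScreenshot was the API of that Unity generation (newer versions use ScreenCapture.CaptureScreenshot):

using UnityEngine;

// Sketch: write one numbered .png per frame. Expect a large performance
// hit, so remove this component when you are done recording.
public class ScreenCaptureEachFrame : MonoBehaviour
{
    private int frame;

    void LateUpdate()
    {
        Application.CaptureScreenshot("frame_" + frame.ToString("D5") + ".png");
        frame++;
    }
}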

I compiled the .png files that this script writes into a video called Unity3dTouch2.wmv using Autodesk 3ds Max and Windows Live* Movie Maker. I removed this script from the main camera upon completion of the video because it noticeably degrades the performance of the scene when active.

Video 1: Touch Script Multi Gesture Example

Common Game Controllers

Common game controllers include first person, third person, driving, flying, and overhead. Let’s look at each.

First Person

When comparing the GUI widgets implemented in the example above to the stock Unity 3D first-person controller, one of the most noticeable differences is that the GUI widget example can leave the camera rotation in an odd configuration. When you use two buttons for capsule rotation, it’s not immediately obvious how to return the rotation to its original state, with the camera aligned with the scene horizon.

The stock Unity 3D first-person controller uses a script called MouseLook to perform the functionality that the Rotate[Left,Right,Up,Down] buttons provide. The MouseLook script uses localEulerAngles to rotate the camera, which is a better way to rotate the camera view than the capsule rotation I used in the example. To take advantage of it, you can proceed as with the FPSInputController: add public variables mouseX and mouseY to the MouseLook script and use them to replace the Input.GetAxis(“Mouse X”) and Input.GetAxis(“Mouse Y”) calls. When these variables are hooked up to the rotate buttons and incremented and decremented, respectively, the scene’s main camera rotates in a more useful manner.
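The following C# sketch shows the shape of that change. It is a simplified stand-in for MouseLook rather than the stock script, with mouseX and mouseY exposed for the touch buttons; the sensitivity values and clamp range are illustrative:

using UnityEngine;

// Simplified MouseLook-style rotation driven by touch buttons instead
// of the mouse axes.
public class TouchLook : MonoBehaviour
{
    public static float mouseX = 0f;   // set by RotateLeft/RotateRight buttons
    public static float mouseY = 0f;   // set by RotateUp/RotateDown buttons

    public float sensitivityX = 15f;
    public float sensitivityY = 15f;
    private float rotationX;
    private float rotationY;

    void Update()
    {
        // The stock script reads Input.GetAxis("Mouse X") / ("Mouse Y");
        // here the same math runs on the touch-driven values instead.
        rotationX += mouseX * sensitivityX * Time.deltaTime;
        rotationY += mouseY * sensitivityY * Time.deltaTime;
        rotationY = Mathf.Clamp(rotationY, -60f, 60f);
        transform.localEulerAngles = new Vector3(-rotationY, rotationX, 0f);
    }
}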

Third Person

The stock Unity 3D third-person controller can be adapted to touch in a way similar to the first-person controller. Implement Move[Left,Right,Up,Down] in the ThirdPersonController.js script after hooking it up to the touch buttons with a new script, as before. The stock Unity 3D third-person controller automatically calculates the main camera rotation and position, leaving the second GUI widget created in the example available for alternate use. One possibility is to use the top and bottom buttons to increase and decrease variable jumpHeight, respectively, and use the left and right buttons to increase and decrease variable runSpeed, respectively. Many variables are available for similar adjustment in the ThirdPersonController.js script.
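As a sketch of the idea, a touch button could nudge one of those variables on each press. The controller type below is a hypothetical stand-in for a C# port of ThirdPersonController (the stock asset is UnityScript), included only to make the example self-contained:

using UnityEngine;
using TouchScript.Gestures;

// Hypothetical stand-in for a C# port of ThirdPersonController,
// exposing the tunable variables mentioned above.
public class MyThirdPersonController : MonoBehaviour
{
    public float jumpHeight = 0.5f;
    public float runSpeed = 6.0f;
}

// Sketch: a press on the widget's top button raises jump height.
// A matching script would lower it, and the left/right buttons would
// adjust runSpeed the same way.
public class AdjustJumpHeight : MonoBehaviour
{
    public MyThirdPersonController controller;   // assigned in the Inspector
    public float step = 0.25f;

    void OnEnable()
    {
        GetComponent<PressGesture>().StateChanged +=
            (s, e) => controller.jumpHeight += step;
    }
}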

Driving

In the controllers examined so far, the Move[Left,Right,Forward,Reverse] scripts stop the motion of the character when an OnRelease event is detected. For a driving-type game, the scripts would do more than send a 1 or −1 to the first- or third-person controller. The Forward and Reverse scripts would send a range of values to emulate throttling and braking. The first 80% of the throttle range might be reached quickly while the button is held, for rapid acceleration; the remaining 20% would be fed in slowly, so that maximum speed is only attained after holding the Forward button down along a long straight road. The left and right buttons would behave similarly, possibly controlling the rotation of a Unity 3D asset that uses a wheel collider. In this type of scene, the GUI widget not used for steering can control parameters such as camera distance from the vehicle, throttle and braking sensitivity, and tire friction.
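A throttle ramp of that kind might look like the following C# sketch; the rates, the 80% threshold, and the forwardPressed flag (set by the Forward button script) are illustrative assumptions, and the resulting throttle value would be fed to whatever vehicle controller the game uses:

using UnityEngine;

// Sketch: while the Forward button is held, ramp quickly through the
// first 80% of the throttle range, then creep toward 100%.
public class TouchThrottle : MonoBehaviour
{
    public static bool forwardPressed;   // set by the Forward button script
    public float fastRate = 1.0f;        // throttle units per second below 0.8
    public float slowRate = 0.05f;       // throttle units per second above 0.8
    public float throttle;               // 0..1, read by the vehicle controller

    void Update()
    {
        if (forwardPressed)
        {
            float rate = throttle < 0.8f ? fastRate : slowRate;
            throttle = Mathf.Min(1f, throttle + rate * Time.deltaTime);
        }
        else
        {
            throttle = 0f;   // released: braking/coasting handled elsewhere
        }
    }
}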

Flying

To use the GUI widget interface developed in the example for a flying-type game, you would use the Move[Left,Right,Forward,Reverse] buttons much like a joystick or flight stick: the left and right buttons would adjust roll, and the up and down buttons would control pitch. The Rotate[Left,Right] buttons in the other GUI widget can be used to increase and decrease yaw and camera distance from the aircraft.

Overhead View

In this type of scene, the main camera orbits the scene from overhead, likely moving around the perimeter of the scene while “looking” at the center of the scene. In a script attached to a Unity 3D scene’s main camera, you could define several Vector3s at points along the perimeter of the scene. Using the Vector3.Lerp function, you can control the fraction parameter with the MoveLeft and MoveRight GUI widget buttons to move the camera between two of the perimeter points. The script can detect when a perimeter point has been reached and begin “Lerp’ing” between the next two Vector3 points. The MoveForward and MoveReverse buttons can be used to adjust the vertical component of the Vector3 points to move the orbiting camera closer to or farther away from the scene. You could employ the other GUI widget being used for Rotate[Left,Right,Up,Down] in the example for a wide variety of things, such as time-of-day control or season-of-year control.
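Here is a C# sketch of that orbiting camera. The perimeter points, speed, and the direction value driven by the MoveLeft/MoveRight buttons are illustrative assumptions:

using UnityEngine;

// Sketch: Lerp the camera between consecutive perimeter points, with
// the fraction driven by the MoveLeft/MoveRight buttons, while always
// looking at the scene center.
public class OverheadOrbitCamera : MonoBehaviour
{
    public static float direction;        // -1, 0, or 1, set by the buttons
    public Vector3[] perimeterPoints;     // points placed around the scene edge
    public Vector3 lookTarget = Vector3.zero;
    public float speed = 0.2f;            // fraction of a segment per second

    private int index;                    // current segment start
    private float fraction;               // 0..1 along the current segment

    void Update()
    {
        fraction += direction * speed * Time.deltaTime;
        if (fraction > 1f) { fraction = 0f; index = (index + 1) % perimeterPoints.Length; }
        if (fraction < 0f) { fraction = 1f; index = (index - 1 + perimeterPoints.Length) % perimeterPoints.Length; }

        Vector3 a = perimeterPoints[index];
        Vector3 b = perimeterPoints[(index + 1) % perimeterPoints.Length];
        transform.position = Vector3.Lerp(a, b, fraction);
        transform.LookAt(lookTarget);     // keep "looking" at the scene center
    }
}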

Issues with Touch Control

The most readily observed issue in using touch control in the example above is that it blocks the view of the scene in the lower left and lower right corners. You may be able to partially remedy this problem by getting rid of the cameras that view the GUI widget buttons and making the buttons children of the scene’s main camera. The buttons would still be in the scene but would no longer block out an entire rectangle of it.

You could further minimize button visibility by making the buttons larger for more intuitive contact, and then making them disappear when touched and reappear when released. You can achieve this by toggling the asset’s MeshRenderer in the onPress and onRelease functions as follows:

// In the onPress handler:
GetComponent<MeshRenderer>().enabled = false;

// In the onRelease handler:
GetComponent<MeshRenderer>().enabled = true;

Another challenge of using touch is ergonomics. When using touch, users can’t rest their wrists on a keyboard and may not be able to rest their elbows on a desk. When developing GUI widgets for touch, take care to place buttons in the best positions possible and to use the most efficient gestures possible.

Conclusion

The TouchScript package functions well when implementing the Press and Release Gestures. The resulting Unity 3D scene performed as desired when developed and run on Windows 8, even though the TouchScript input was defined for Windows 7.

The more common gaming interfaces can be emulated with touch. Because you can implement many combinations of touch gestures, many options are available when implementing and expanding these emulations. Keeping ergonomics in mind while implementing these GUI widget interfaces will lead to a better user experience.

About the author

Lynn Thompson is an IT professional with more than 20 years of experience in business and industrial computing environments. His earliest experience is using CAD to modify and create control system drawings during a control system upgrade at a power utility. During this time, Lynn received his B.S. degree in Electrical Engineering from the University of Nebraska, Lincoln. He went on to work as a systems administrator at an IT integrator during the dot com boom. This work focused primarily on operating system, database, and application administration on a wide variety of platforms. After the dot com bust, he worked on a range of projects as an IT consultant for companies in the garment, oil and gas, and defense industries. Now, Lynn has come full circle and works as an engineer at a power utility. Lynn has since earned a Masters of Engineering degree with a concentration in Engineering Management, also from the University of Nebraska, Lincoln.

Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.
Copyright © 2013 Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.

  • graphical user interface
  • Unity 3D
  • Touch Devices
  • Microsoft Windows* 8
  • Windows*
  • Game Development
  • Laptop
  • Tablet
  • URL

  • EducAR - Learn more about this educational Augmented Reality app that was adapted for the Perceptual Camera!


    There is no doubt that education and games are increasingly connected, especially when the game is in AR!

    Games are a great learning tool because they stimulate and engage students. However, game resources are limited by the complexity that certain tasks impose. Educational games developed in an Augmented Reality environment allow greater realism and interactivity, aiming to improve the student's learning process. AR technology in playful activities is very powerful and can provide greater involvement and motivation for students. Games involving scavenger hunts and searches in open spaces are quite effective.

    bemvindo

    After concluding that the initiatives in Pernambuco exploring sociocultural potential through digital productions such as advergames and modern educational programs in the form of games and applications for computers, smartphones, tablets, etc. were insufficient, we proposed alternatives to fill that need.

    After extensive research into the country's education deficit, EducAR was created!

    Educar Project

    MORE ABOUT THE PROJECT:

    The project brings a lot of innovation and interaction to users, and it was a success while still in the testing phase of the mobile version!

    Educar no Salésiano

    EducAR is an app whose goal is to let students deepen, share, and interact in a digital, collective, gamified environment covering the subject that was or will be addressed in the classroom, to spark interest and reinforce learning.
    The app is currently developed for the Android platform.

    INTEL PERCEPTUAL CHALLENGE:

    Perceptual Challenge

    After learning about the contest, and since we are passionate about technology and always looking to innovate, we decided to try adapting the project to the Perceptual Camera.

    At first, in our project submission, we proposed developing only the Augmented Reality technology and, if time allowed, we would try to implement gesture interaction with the Augmented Reality. But then several problems appeared!

    THE CHALLENGE OF ADAPTING THE PROJECT:

    Because the project was developed for mobile devices, a big difficulty we found was reconciling the use of the marker with the gestures, so that one does not pass in front of the other and interrupt the marker tracker. We tried using the marker on a table, but that made it awkward to perform the gestures in front of the camera. To get around this obstacle, we developed a kind of "bib" to wear as the marker, so we could project the AR and use the gestures comfortably at the same time!

    Pedromarker

    (We were not able to change the marker image at the time, but the problem is now being fixed.)

    This is how we developed the EducAR Intel Perceptual Version (demo); Intel's help and support were extremely important for us to be able to develop this app! The forum helped us a lot!

    Here is a teaser of the EducAR Perceptual Version.

    In the contest there were a few projects using AR technology, but since the SDK was still under development it was a bit unstable; we managed to work around some of the problems. (Some bugs have already been fixed in the new SDK 5 release.)

    Now we will try to implement voice commands; when we tried in the previous version, not even our "OK" was recognized properly. It must have been our northeastern accent. hahaha :)

    MORE ABOUT THE COMPANY BEHIND THE APP:

    Founded in 2013, RApp´s Studio is a startup from Pernambuco, incubated at the Instituto de Tecnologia de Pernambuco (ITEP). It designs and develops digital solutions and content in IT using Augmented Reality technology, such as advergames, applications, and advertising pieces in general. The company uses this technology to provide and highlight a flow of immersion in its products, offering a powerful relationship between client, product, and service.

    RApp´s Studio

    LINKS:

    - To learn a bit more about the company behind the app, visit: http://www.rapsstudio.com

    The RApp´s Studio team thanks Intel for all the help and support!

  • Rappstudio
  • Perceptual Challenge Brasil

  • Education
  • Game Development
  • Intel® Perceptual Computing SDK
  • Perceptual Computing
  • C#
  • Unity
  • Windows*
  • Laptop
  • Tablet
  • Desktop
  • Developers
  • Partners
  • Microsoft Windows* (XP, Vista, 7)
  • Microsoft Windows* 8
  • GA Tech 2013 Code for Good Student Hackathon


    For 24 hours in early November, we held the 2nd GA Tech Code for Good Student Hackathon.  In continuation of last year's event here, we retained the theme of teaching healthy lifestyle choices to combat childhood obesity.  From edutainment to exercise games, we seek to create worthwhile projects that can help an at-risk demographic: our future.

    With Intel providing the food and Android tablets, the students have been working non-stop on these beneficial games.  Our host at Georgia Tech is Professor Matthew Wolf.  Special guest Cornelia Davis from Pivotal labs joined us to share her expertise on Cloud Foundry, with which the students have hosted and distributed their software. 

    Variations on the Theme

    From the previous hackathon on this subject, domain experts share insights:

    From Healthier Generations -

     Are there technologies that solve similar problems?

    Perhaps you’re inspired by a feature of another piece of technology such as an app on your phone, or an online service. Do you know of other technologies that solve similar problems, or solve a problem in a similar way to what you imagine?

    Two apps that do some of the things that we think are important are Instagram* and WebMD*. 

    Instagram - people can take photos, put them on a map and connect with others through images. In case of childhood obesity, they could take photos and/or map comments about their environment as it relates to access to healthy food and safe places for physical activity. 

    WebMD* - similar to how WebMD identifies symptoms and treatments, we would like to offer questions about a person’s environment and help them identify solutions in their environment.

    From Dr. Marks

    1) the most important thing is to get people moving.  Hopefully walking, but at least moving.  Games that require and reward the kids to actually walk to move the character through the game would be great.

    2) Nutrition that not only rates meals, but also allows them to have nutrition information in an understandable format, relevant to school lunches, would also be good.  The overwhelming majority of school foods in this country are provided by a single company, so this is do-able.  It does need to be fun, or kids won't do it.  You can also take advantage of the cameras that most cell phones have these days.  Is there any way to photograph a school lunch and cross reference it with the known inventory of the company supplying the food?  Could you have some kind of reference item of known shape and size that gets photographed with the food so that portion sizes can be estimated?

    3) knowledge is power.  Kids that know where their food came from make better choices.  How many kids know, for example, that ketchup is mostly high fructose corn syrup?  Do they even know what a tomato is?

    4) kids do in fact educate and pressure their parents in very meaningful ways. The question is how to build in motivation and reward on both sides.

    5) is there any way to turn a standard phone into a pedometer?  Can you track how much a child moved so that appropriate rewards can be offered?

    6) improving our ability to move through the built environment is key.

    There are many map programs that calculate driving routes. Is it possible to calculate the best/safest walking or biking route?

    The Teams

    Team: "MEM@" 

    Game: Geocaching mystery game, find virtual clues at physical locations.  Encourages physical activity by movement among target places.

    The team started work using the Cloud9 IDE, an online collaboration tool, while investigating geolocation, Google Maps API, and MongoDB for use with Cloud Foundry.  Unfortunately their lead coder disappeared for a few hours, so the team iterated on high-level design and architecture.  Upon his return, they consolidated their work and reached a working prototype.  Despite trouble integrating with the server, this team's demo was functional by the end of the event.

    The demo was difficult to film, due to the nature of the game.
     

    Final report out: http://www.youtube.com/watch?v=iEz3q_ibC04

    Team: RADD

    Game: Multi-device whack-a-mole style monster catching game.  Requires physical activity to catch monsters, thus gaining points.

    From the start, the team split into pairs working on the frontend and backend, using the Handlebars.js templating engine and MongoDB respectively. The interfaces were completed quickly, followed by the registration/login features, but the team hit a snag with MongoDB authentication issues.  As soon as the problems were ironed out, integration went smoothly.

     The final demo was as entertaining to watch as it was to play: http://www.youtube.com/watch?v=pTsOadJAsn4

    Final report out: http://www.youtube.com/watch?v=TWbOsAFF8kg

    Wrap-up

    By the end of this event, all involved had enjoyed the time but were ready to rest.  The timing was not ideal (right after GA Tech's homecoming, when pre-Thanksgiving projects were all coming due), and there was another event happening right down the street, but the reduced turnout allowed closer work with the teams.  This also resulted in the highest percentage of demo-ready apps per team, as both teams reached that milestone.  I look forward to working with GA Tech again next year.

    Links

    Hackathon announcement: http://cercs-ed.gatech.edu/activities/2013hackathon

    GA Tech CERCS live blog: http://cercs-ed.gatech.edu/activities/2013hackathon-liveblog

    Cornelia's Cloud Foundry blog coverage: http://blog.cloudfoundry.com/2013/11/14/georgia-tech-hack-for-good-on-cloud-foundry/

     

     

  • Code for Good
  • hackathon
  • GA Tech
  • healthy living
  • html5
  • javascript
  • cloud foundry
  • pivotal labs

  • Event
  • Cloud Computing
  • Game Development
  • HTML5
  • JavaScript*
  • Android*
  • Cloud Services
  • Code for Good
  • HTML5
  • Tablet
  • Developers
  • Professors
  • Students
  • Android*
  • Building a Local Developer Community: A Conversation with Intel East Africa SSG's Fredrick Odhiambo


    The Intel Software and Services Group (SSG) opened its first office in East Africa in April 2013. This was a big move for Intel, which had previously only had offices in South Africa and Egypt. The local SSG is responsible for working with East African developers to provide them with design tools, resources and expert consulting. 

    In a bid to find out more about what it means for local Kenyan developers to have the SSG available to them, we chat with Fredrick Odhiambo, an application engineer with Intel East Africa. A graduate of the University of Nairobi, he focuses on the developer space and seeks to enhance local innovation and provide technical expertise support and tools to developers that enable them to create applications with rich user experience on devices running on Intel technology.

    Q: Why does Intel care about software? As far as most people/developers are concerned Intel deals solely with hardware - how do you make the connection with software?

    A: The Software and Services group works closely with independent software vendors and operating system vendors. Our aim is to enable them to make the most of our hardware by developing software that runs optimally on our platform. 

    Q: Why should I care about going native as a developer? Android is Android. What does going native even mean? 

    A: Going native means developing differentiated Android apps using native programming languages like C, C++ or Assembly Language. Native development gives the developer a better, differentiated app that takes advantage of direct CPU and hardware access. Native development is good for performance intensive tasks like signal processing, image manipulation and complex algorithms.

    As for numbers from Google Play, 68% of the top 300 free apps on the Play store are native Android apps, and only 32% are Dalvik Android apps. This trend is the same for the top 300 paid Android apps, where 80% are native Android apps. 

    Q: How many apps have you developed so far?

    A: Five apps, on different platforms: J2ME, Android, Qt, and HTML5.

    Q: What has been your most successful app? Why? 

    A: My most successful application was a mobile phone utility for the blind, a free text to speech application based on low end, inexpensive java phones. The application helps visually impaired persons to use their mobile phone utilities with ease, i.e. calculator, SMS, phonebook etc. The application is in Swahili which is the most widely spoken language in East Africa, after English. The app is being used by visually impaired and blind people across the country and has had a huge impact on the lifestyle of the disabled. 

    Q: You have been seen pushing for the Havok gaming engine. Why would a game developer pay attention to this particular gaming engine?

    A: Havok enables developers to create world class gaming apps, and it’s totally free. The cross platform supports Android, iOS and Tizen; and has a published roadmap with regular releases. We are currently running an 8 week fully sponsored training program in Nairobi on how to build apps on the Havok gaming engine.

    Q: What are you learning right now? 

    A: Perceptual computing. Perceptual computing focuses on research and development of technologies that leverage natural user interactivity, enabling users to interact with their PCs in more intuitive, natural, and engaging ways. The main modes of interaction are speech recognition, finger/hand gestures, face recognition, and augmented reality. A typical example is facial login: no passwords required at all, the system “remembers” the user.

    Q: What is important for developers in Africa to know regarding how they can benefit from Intel? 

    A: Sign up at Intel Developer Zone Africa. This is a global program that enables developers to engage with Intel on topics related to software and is designed to answer today’s software development challenges. The portal gives developers access to our latest software development and optimization tools.

    Q: Thanks for your time Fred! Any parting shot? 

    A: Let’s go native! 

     

     

  • Intel SSG
  • East Africa
  • Kenya

  • Marketing
  • Development Tools
  • Game Development
  • C/C++
  • Java*
  • Qt*/QML
  • Developers
  • Students
  • Android*
  • Apple iOS*
  • Tizen*
  • Not built in a day - lessons learned on Total War: ROME II


    Download Article


    Not Built in a Day – Lessons Learned on Total War: Rome* II [PDF 935KB]

    Abstract


    The developers at Creative Assembly had a challenge: How could they get Total War: ROME* II to play well across a wide range of Intel® systems without compromising the game aesthetics? This case study details how the game takes best advantage of low-power systems, while still scaling up to look and run fantastic on more robust systems.

    High-fidelity landscapes are an essential part of the game’s rich historical environments, and lush foliage is key to those landscapes. The foliage was optimized with Adaptive Order Independent Transparency (AOIT), giving it a rich look with low performance overhead. The team added a game benchmark for an easy way to measure performance. An in-game battery meter allows players to monitor power when playing on the go. They also tuned for several different systems at once to ensure that any optimizations were balanced across systems, and added detection code to automatically set the right options for each system.

    The team made improvements to many other areas, including LODs, shadows, landscape generation, particles, CPU tasks, and sound. They also optimized for memory bandwidth.

    Together, these gave the game great performance across a wide range of Intel systems.

    The challenge

    When building Total War: ROME II, Creative Assembly challenged Intel. They wanted to deliver the most immersive experience of any Total War game to date, on all types of systems, with no compromise. But today’s systems have a variety of features and come in a number of form factors. How could they deliver great gaming at fluid frame rates across all of these systems? They turned to Intel’s engineers. Together, we delivered a fantastic game that players can enjoy across the full range of the latest Intel systems, from the most power-thrifty Ultrabook™ up through full power laptop, all-in-one, and desktop systems.


    Figure 1. Typical scene

    The team put their requirements on a sliding scale, with more capable machines delivering an even faster frame rate with higher resolution and quality settings. Since the game is faster on more capable systems, advanced features are enabled so the game renders with even higher quality. This includes AOIT built on top of the Intel® Iris™ graphics extension for pixel synchronization, which gives systems with Intel graphics much faster transparency calculations.

    To make sure that the game is playable across a wide range of systems, it automatically configures itself to match each system. When the system is running on battery, the game displays a battery meter. ROME II also includes a built-in benchmark mode so anyone can see the game’s performance.

    The team studied the GPU and CPU performance of the game with Intel® Graphics Performance Analyzers (Intel® GPA), Microsoft GPUView, and Intel® VTune™ Amplifier XE.

    As we walk through this case study, you should see similarities with your game development. We hope this case study helps you implement similar features in your game.

    Detecting the platform

    With code like the GPU Detect code sample, ROME II detects the system’s graphics device. With this information, the game configures itself for the best visual fidelity on each system. On systems with Intel® HD Graphics 4200/4400/4600 (typically systems that use 15W of power), the game defaults to 1366x768 resolution and Medium quality.


    Figure 2. Sample scene on Intel® HD Graphics 4600 system

    For systems with Intel HD Graphics 5000 and Iris Graphics 5100, the game sets itself to 1600x900 with Medium quality, with increased shadow fidelity.

    On systems with Iris™ Pro Graphics 5200, the game defaults to 1600x900 with High quality, using AOIT for even better visual quality.


    Figure 3. Sample scene on Iris™ Pro Graphics 5200

    Extensive benchmarking proves that the game has great frame rates (>=30 FPS for at least 95% of typical game play) on all these configurations.

    Setting the bar with a benchmark

    To make it easier to measure the game’s performance across a variety of systems, ROME II includes a benchmark to showcase typical performance across a campaign scenario. Go to the advanced graphics options screen and select “run benchmark”.  For simpler benchmarking, it can also be started from the command line. Although the game ships with a single benchmark, it has a benchmark selection screen, so it can use additional benchmarks.

    The benchmark doesn’t specifically measure power. If you want to study power during gameplay, run the benchmark and a power-monitoring tool (see Intel® Graphics Performance Analyzers (Intel® GPA) System Analyzer for real-time power measurement).

    We recommend you build a benchmark like this (plus a benchmark-running tool), to show your game’s typical performance.

    We’ve got the power

    To showcase how long the game can last on battery, it includes a battery meter. The meter is hidden if the battery is fully charged and plugged in to AC power. Anytime an Ultrabook or laptop system is running on battery or charging from AC power, the battery meter appears so the player knows how long they can play. The battery meter has been integrated into the screen so that it doesn’t hide any essential information.


    Figure 4. Battery meter at the top of the screen

    Some games adapt to battery power by reducing resolution and quality settings, or reducing frame rate. These strategies did not give a satisfying gaming experience in ROME II, so the team did not include any specific power optimizations.

    As you work on your game, study how it’s using power, and see if some of those optimizations might be right for you.

    AOIT makes the vegetation look good

    The Total War games are known for their immersive environments, with realistic foliage. This aesthetic requires transparency, and lots of it.

    For the game to look its best on Intel graphics hardware, it uses an Intel Iris graphics extension to the DirectX* API. The pixel synchronization extension gives a low-overhead way to synchronize pixel writes via the graphics driver, which accelerates transparency.

    The game originally used an Alpha-to-Coverage solution for transparency. The team planned to supplement this with a k-buffer solution in ROME II. They found that the k-buffer works fine for small areas of the screen with a fixed amount of transparency. However, there were problems on a full screen with the levels of overdraw seen in ROME II. It quickly ran out of GPU memory, so it ran far too slowly. AOIT doesn’t suffer from that problem, and is about 5x faster than k-buffer. AOIT also lets the player see more alpha around the edges of the leaves. This gives a better appearance of depth, especially at resolutions well below 1080p.

    While a general AOIT algorithm was published a few years ago, a recent code sample details how to accelerate AOIT with pixel synchronization. With pixel synchronization, shaders write colour and depth into an Unordered Access View (UAV) buffer. The farthest colours are blended as they are written. Then, a visibility function (VF) combines the colours with a refactored alpha-blending equation. Together, this gives a deterministic and fast way to compute transparency.

    The AOIT pixel synchronization sample was literally “dropped in” to the ROME II code base with few changes. The ROME II version of AOIT has a pre-multiplied alpha and does some culling. It was easy to add lighting to the AOIT pass and let the tree foliage cast its own shadows.

    For higher-end systems, AOIT is now the default for transparency. All other configurations use Alpha-to-Coverage.


    Figure 5. Vegetation looks great, thanks to AOIT

    AOIT operates on the foreground and middle distance vegetation, with great results. While it’s made this screen shot look great, the actual movement in the game is even better. AOIT should be just as easy to include in your game.

    Bandwidth optimizations

    After studying the game, we felt that reducing bandwidth at the GPU would increase overall performance. With multiple render targets, the game wrote to a single output format, all the same size. Since it’s possible to have different output formats and bit widths on each render target, we changed the game so it picks appropriate formats and sizes for each.

    The landscape engine generates textures. It stores surface colour and normal as RGBA 8 textures, even though the alpha channel is unused. A more efficient format can reduce the bandwidth.

    The terrain sampler used manual bilinear filtering. The team studied the vertex shader’s profile in Intel GPA Frame Analyzer, and found the shader was too slow. It read 4 heights with a gather4, and then performed a manual bilinear filter of the 4 values. The vertex shader now has a sampler with appropriate filtering, so it’s quicker.


    Figure 6. Profiled times for manual bilinear filtering in the terrain vertex shader

    We discovered some LOD models were not optimal. For example, the torso animation model is instanced throughout a campaigning army. But Intel GPA revealed those vertex shaders were invoked too often. This is a common issue so we usually check for it. Instancing can be effective when rendering a whole group of objects with the same model using the same mesh. However, when the objects are dispersed throughout the scene with very different Z depths, there are often a large number of objects in the distance. They render to very small parts of the screen, giving lots of sub-pixel polygons. Even though the instancing reduces the number of draw calls, the number of vertex shader calls on those sub-pixel polygons can become very inefficient.

    To avoid this, be cautious with Z depth when instancing. For complex models, only instance them if the Z depth of the instance is a fairly close match with the Z depth of the original.


    Figure 7. Vertex shader invoked as often as pixel shader, a clear sign of trouble on this instanced mesh

    In general, the vertex shader should be invoked much less often than the pixel shader, and any large number of primitives, post-filter texels and reads may indicate the same issue. In this example, the vertex shader was invoked as often as the pixel shader. To fix this, the game switched to a simpler LOD model for distant objects.

    Shadows

    Shadows revealed a number of issues that were all ultimately improved.

    We had an issue with the number of cascades. ROME II had a cascaded shadow map, which allows great detail in close up areas but still maintains a wide area of shadow detail. At first, the game had three cascades that needed depth tuning for the best effect. By carefully placing the division between cascades, we reduced this to two cascades. The shadow effect was still good, but the game ran faster.

    Shadow generation and the main rendering pass both had the same vertex signature (11 inputs). This was inefficient since shadow generation didn’t need a large part of that vertex signature. We couldn’t simplify the vertex signature, however. Many components of the scene need alpha or punch-through areas of their textures, so texture processing was necessary for shadow creation. Later in the project, the input signatures were separately reduced (by 3 float4s), giving shadows a small performance gain.

    Shadow maps suffered from the same sub-pixel geometry issue created by distant instanced objects. When the LODs of the main scene were tuned (see above), the shadow map creation improved similarly. The game got a small speedup by changing to LOD models, with impostors for distant models.

    Landscape

    Originally, the landscape was tessellated in screen space. This resulted in high polygon counts, so we replaced it with a tiled renderer. This had a high vertex shader cost, but rendered much faster (from 5.4 ms to 2.4 ms on the same scene).

    ROME II uses very large landscape textures. The visible area is generated in real time. Height and terrain information is composited on the GPU and stored in a texture atlas, but required careful tuning. Each frame renders enough tiles to display newly visible areas, while limiting its work so that it doesn’t take too much GPU and interfere with the rest of rendering.

    Particles

    Looking more closely at a typical frame capture, there was an expensive section of work that had little effect on the frame.


    Figure 8. Particles were expensive but had little effect

    This was due to particles, which were each drawn as a separate polygon. They had no impact on the scene, so they could be removed for a large speedup.

    Although it seemed that particles might not interact well with AOIT, the new particle engine developed for the game worked well and had no problems with AOIT.

    Tuning the CPU-side code

    Although graphics received a lot of attention during this project, the team also wanted to optimize the CPU side of ROME II. With Intel® VTune™ Amplifier XE, the team studied the game for CPU bottlenecks, and several areas yielded impressive gains.

    The sound engine took more CPU time than expected. While it’s a powerful sound engine, capable of sophisticated mixing and blending on the CPU, it ran at its highest detail level even when set to “normal.” Fixing this sped up frames by up to 1.1x. Lower-power systems benefited from this, yielding a better frame rate and longer battery life.

    The game includes a task-based threading system. The tasks vary in size, and some of them didn’t interact well with the automatic task scheduling from the task pool. This caused “bubbles” in the schedule and slowed the frame. Problematic tasks were removed from the task list and manually scheduled on their own thread, separate from everything else. This gave an optimal thread schedule.

    Conclusion

    Working together, Creative Assembly and Intel carefully studied ROME II during development. Using Intel GPA, Intel VTune Amplifier, GPUView and deep analysis, the team found many issues to improve. Together, the team tuned many parts of the game and integrated industry-leading algorithms.

    Since the game automatically configures itself for each system, it’s faster in many cases than it would have been. The battery meter and in-game benchmark make it possible to study the game during gameplay or for benchmarking.

    Foliage benefited from AOIT, LODs were isolated to the right depth and replaced with impostors in the distance, shadows are much faster, the landscape and particle systems work better and faster, and the sound and task systems use the right workloads at the right times.

    Together, this all lets Total War: ROME II look and run great on Intel platforms. We hope this case study helps you do the same with your next game! Let us know what you think.

    Huge thanks to Creative Assembly for building a great franchise and extending ROME II to shine on Intel platforms. Special thanks also to Steve Hughes of Intel, who has worked with Creative Assembly on a number of Total War games, delivering the best results to date in ROME II.

    About the Author

    Paul Lindberg is a Senior Software Engineer in Developer Relations at Intel. He helps game developers all over the world to ship kick-ass games and other apps that shine on Intel platforms.

    Intel, the Intel logo, Iris, Ultrabook, and VTune are trademarks of Intel Corporation in the U.S. and/or other countries.
    Copyright © 2013 Intel Corporation. All rights reserved.
    *Other names and brands may be claimed as the property of others.

  • ULTRABOOK™
  • applications
  • Iris™ Pro graphics
  • graphics
  • Microsoft Windows* 8
  • Game Development
  • Graphics
  • User Experience and Design
  • Laptop
  • Tablet
  • Desktop
  • URL
  • Student Hackathon in a Box


    These events can show non-coders new potential, teach effective software development practices, help students acquire specific technology and interpersonal skills, and bridge the gap between academia and the real world.  In school, you learn and then apply; in the real world, you have to apply without learning.  These events help participants “learn how to learn,” learning through application.

    The student-led hackathons will generally be remotely supported by Intel (funding, video calls, target platforms, etc.), run by student ambassadors with guidance and facility management from experienced faculty.  One (or more) of the planners should be in charge of maintaining an event blog and gathering the code (generally on GitHub) for posterity.

    Once your core team is in place, it’s time to start planning.

    Preparation

    • Theme
      • “Code for Good”- beneficial to society (e.g. teaching middle school students basic algebra, healthy lifestyle choices to combat childhood obesity, etc.)
      • Not so general as to give no guidance or ideas
      • Not so narrow as to specify app to be created
      • Local community concerns are a good place to start
    • Participants
      • Early signup
      • Commitment
      • Ongoing communication to maintain preparation and involvement
    • Internet
      • Ethernet connections as backup
      • Power outlets and strips
      • Any platforms and servers required
    • Food
      • Meals
      • Grazing between
      • Caffeine for late night boosts
      • With Intel buying the food, it’s often best to order online (from Safeway or similar)
    • Facilities
      • Power
      • Lights
      • AC
      • Unlocked door(s) for access
      • Parking and security if necessary
      • All of the above assured overnight
    • Swag
      • T-Shirts
      • Stickers
      • Prizes
      • Certificates

     

    Ordering t-shirts

    A few things to keep in mind to keep the shirt designs standardized and useful:

    • Color limitation (not counting shirt color)
    • Badge section on front
    • Full event design on back
    • Code for Good logo on right sleeve
    • Mix of sizes or ask during signup

     

    Specific technology training

    Some tools, engines, and technologies have a steeper learning curve than others.  Make sure to prepare a workshop before the hackathon when using these or the work time will be drastically inhibited by learning how to use them.

     

    Downloadable toolkit image

    It’s often helpful to package up the relevant tools for quick deployment to all development platforms (often student laptops).  Here’s an example bundle:

    • Notepad++ for basic text editing
    • GIMP for images, Aseprite for sprites
    • Possibly a level editor such as Ogmo, Tiled, or Dame
    • Audacity, bfxr and/or musagi for audio
    • Event-specific tools such as XDK or Project Anarchy

    Remember to consider the mix of operating systems.

    Tips and tricks for a successful hackathon

    • Games keep participants motivated and are easier to demo/see progress
    • Competitive events would only appeal to half of the participants; collaboration is better for learning
    • Rather than judging with rankings, give tickets when participants exhibit beneficial behaviors or reach milestones.  At the end, raffle prizes small to large.
    • Commitments should be for the entire duration; having a freeform “drop in and out” arrangement might feel more accommodating, but the damage to the team dynamic severely impedes productivity.
    • Schedule beforehand to have status check-in times at regular intervals (1-2 hrs).
    • Schedule meal breaks so they are not near or right after check-ins.  Make sure these are at standard times; hungry people don’t care about much but food.
    • Maintain decorum during check-ins.  Working through someone else’s status is fine, but having a disruptive conversation is simply rude.
    • Check in the loud groups first to quiet their conversations.  If all are quiet, go with the quietest to stir them up.
    • Predetermine milestones.  Estimate the first 25% for design and prototyping, alpha (functional engine working) by 50%, beta by 75% and gold by 90% to leave time for demo/wrap up.
    • “However long you think something will take, triple it.” – Jon Shafer. By the same token, cut your dev time to a third; if you’re doing a 24-hour hackathon, what can you get done in 8 hours?  Not a sprawling MMO; aim for smaller scale.
    • Programmer art is fine.  If you have a fun game that stars only boxes, you have a fun game.  Polish is for later.
    • Have a camera, take pictures.  Prep time, work time, check-in time, wrap-up, all of these are ripe with photo ops.
    • Similarly, have a video camera.  Record check-ins, demos, and really any time someone talks to the group.  It’s interesting to see the evolution of the games and teams over the course of the event.

     

    More basic DIY hackathon tips are available at http://software.intel.com/codeforgood/hackathon-in-a-box/

  • Intel Student Hackathon
  • Hackathon in a Box
  • Code for Good
  • Intel Academic Community
  • Academic Program
  • Academics

  • Tutorial
  • Education
  • Game Development
  • Code for Good
  • Developers
  • Professors
  • Students
  • Intro to Motion Estimation Extension for OpenCL*


    Download Article

    Download Intro to Motion Estimation Extension for OpenCL* [PDF 660KB]

    This article introduces Intel’s motion estimation extension for OpenCL*. This extension includes a set of host-callable functions for frame-based Video Motion Estimation (VME).

    This extension depends on the OpenCL 1.2 notion of built-in kernels and on the cl_intel_accelerator vendor extension, which provides an abstraction for specific hardware-accelerated capabilities.

    This article provides a brief overview of the cl_intel_accelerator and cl_intel_motion_estimation extensions. A code example using these extensions is also included along with an explanation of its results.

    For more information on extensions, refer to the cl_intel_accelerator and cl_intel_motion_estimation extension descriptions at the Khronos API registry.

    Motion Estimation Overview

    Motion estimation is the process of determining motion vectors that describe the transformation from one 2D image to another, usually from adjacent frames in a video sequence. The motion estimation functions, considered in this article, accept full-frame single-channel (luma) images as input, perform a motion search operation, and return a motion vector field as output.

    The introduced VME functionality exposes part of the hardware acceleration pipeline for video acceleration. This VME extension provides low-level functionality, currently restricted to the single-channel (luma) input images and block matching methods, so motion vectors are computed for rectangular pixel blocks. Motion vectors are key elements in the video compression algorithms.

    Motion vectors are useful for several applications. For example, when generating “slow motion effects,” motion vectors can provide the basis to generate intermediate frames for frame rate (up)conversion. Another example is increasing the original frame rate of the digitized film (24 fps) to match the TV rate.

    Motion vectors are also useful for image stabilization: the motion vectors in the entire frame can be averaged to produce a “global” motion vector that can serve as an approximation to a real video camera motion.

    The motion estimation extension consists of the new OpenCL built-in kernel (see section 5.6.1 in the OpenCL 1.2 specification) which performs motion estimation, as well as the accelerator object, which represents the state of the underlying acceleration engine. The kernel is queued for execution from the host using the standard ND-range mechanism.

    Both cl_intel_accelerator and cl_intel_motion_estimation extensions should be listed in the CL_DEVICE_EXTENSIONS string (see Table 4.3 in the OpenCL 1.2 specification) for the Intel® HD Graphics device in your system. Otherwise you need to update your GPU driver first.

    General Accelerator API

    Creating an accelerator object

    Accelerator objects provide a black-box abstraction of software- and/or hardware-accelerated functionality from OpenCL vendors. Intel cl_intel_accelerator vendor extension consists of a unified set of OpenCL runtime APIs to create, query, and manage the lifetime of the accelerator objects. The interfaces for this extension are provided in the cl_ext.h header.

    Just as with other vendor extension APIs, the clGetExtensionFunctionAddressForPlatform function should be used to get pointers to the accelerator APIs:

    static clCreateAcceleratorINTEL_fn pfn_clCreateAcceleratorINTEL = (clCreateAcceleratorINTEL_fn)
    clGetExtensionFunctionAddressForPlatform(intel_platform_id, "clCreateAcceleratorINTEL");
    

    clCreateAcceleratorINTEL_fn is defined as an appropriate function pointer in the cl_ext.h.

    Accelerator object instances are referenced with the generic cl_accelerator_intel type. Notice that every accelerator is always associated with a specific acceleration engine type, which is requested by the application at accelerator object creation time. In the example below, the accelerator type is CL_ACCELERATOR_TYPE_MOTION_ESTIMATION_INTEL. Also, descriptors are used to request acceleration engine-specific properties:

    cl_motion_estimation_desc_intel desc = {
    CL_ME_MB_TYPE_16x16_INTEL,                                     
    CL_ME_SUBPIXEL_MODE_INTEGER_INTEL,              
    CL_ME_SAD_ADJUST_MODE_NONE_INTEL,                 
    CL_ME_SEARCH_PATH_RADIUS_16_12_INTEL              
    };
    cl_accelerator_intel accelerator = pfn_clCreateAcceleratorINTEL(context, CL_ACCELERATOR_TYPE_MOTION_ESTIMATION_INTEL,
            sizeof(cl_motion_estimation_desc_intel), &desc, &err);
    

    Refer to the full motion estimation extension specification for the descriptor details. Make sure to handle potential failure of the creation routine when clCreateAcceleratorINTEL returns zero for the accelerator handle value. Possible reasons for accelerator creation failure are invalid descriptors or an invalid combination of descriptor values. The extension specification lists all possible error codes and causes.

    clReleaseAcceleratorINTEL is a complement to the creation API we just discussed. Refer to the Full Frame Motion Estimation Code Example section of this article for the example code.

    Using the accelerator object

    An application can run the accelerated motion estimation functions on an OpenCL device by enqueuing one of the proposed built-in kernels (below). The kernels are enqueued for execution by the regular clEnqueueNDRangeKernel OpenCL routine. In turn, a motion estimation accelerator encapsulates the internal state of the motion estimation engine and serves as the kernel argument to the motion estimation built-in kernel. The relationships between the entities are outlined in the following diagram:

    Motion Estimation API

    Notion of built-in kernels

    Section 5.6.1 of the OpenCL 1.2 specification introduces the notion of built-in kernels. More specifically, clCreateProgramWithBuiltInKernels creates a program object given a context and loads the information related to the built-in kernels into the program object. Notice that the developer does not provide program source code for built-in kernels.

    cl_program program = clCreateProgramWithBuiltInKernels(context,1,device,"block_motion_estimate_intel",&err);
    

    The specific built-in kernels are created from the resulting program object:

    cl_kernel kernel = clCreateKernel(program, "block_motion_estimate_intel", &err);
    

    The kernels can be enqueued for execution by the OpenCL runtime using clEnqueueNDRangeKernel.

    Built-in kernel for the motion estimation

    The cl_intel_motion_estimation extension introduces a new built-in kernel for motion estimation with the following signature:

    __kernel void 
    block_motion_estimate_intel
    (
    accelerator_intel_t accelerator,
    __read_only  image2d_t src_image,
    __read_only  image2d_t ref_image,
    __global short2 * prediction_motion_vector_buffer,
    __global short2 * motion_vector_buffer,
    __global ushort * residuals
    );
    

    This kernel computes motion vectors by comparing a 2D source image with a 2D reference image, producing a vector field of motion vectors. For each pixel block in the source image, the algorithm searches for the best match within a region of the reference image centered on that block's coordinates in the source image (optionally offset by the prediction motion vectors).

    When enqueuing this kernel, global_work_size and global_work_offset determine the region of interest of the input frames. The dimension of the output motion vector image is dependent on the size of the region of interest and partitioning mode specified by the accelerator.

    accelerator should be a valid accelerator object created by clCreateAcceleratorINTEL, where the type of the accelerator must be CL_ACCELERATOR_TYPE_MOTION_ESTIMATION_INTEL.

    The src_image and ref_image images should contain 8-bit luminance data. The image_channel_order and image_data_type of src_image/ref_image are restricted as follows:

    Channel Order    Src Channel Data Type
    CL_R             CL_UNORM_INT8

    motion_vector_buffer represents an output vector field of pixel block motion vectors stored linearly in row-major order. Each entry of the buffer is a motion vector (packed as two 16-bit integer values) for the corresponding pixel block. The buffer needs to be sized appropriately such that it fits the results of all pixel blocks of the source image. The number of returned motion vectors per source pixel block is determined by the mb_block_type defined at accelerator creation time. Therefore, the total number of the motion vectors is the number of source (16x16) pixel blocks times the number of returned motion vectors per source block (1, 4, or 16).
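    As an illustration of the sizing rule above, here is a minimal sketch of a helper that computes the required output buffer size. It assumes the OpenCL headers (CL/cl.h, CL/cl_ext.h) are included; only CL_ME_MB_TYPE_16x16_INTEL appears elsewhere in this article, so the 8x8 and 4x4 constant names are assumptions that follow the extension's naming pattern:

    // Number of motion vectors returned per 16x16 source pixel block for a given partitioning mode.
    static size_t vectors_per_block(cl_uint mb_block_type)
    {
        switch (mb_block_type) {
        case CL_ME_MB_TYPE_16x16_INTEL: return 1;  /* one vector per 16x16 block */
        case CL_ME_MB_TYPE_8x8_INTEL:   return 4;  /* four 8x8 sub-blocks (assumed constant name) */
        case CL_ME_MB_TYPE_4x4_INTEL:   return 16; /* sixteen 4x4 sub-blocks (assumed constant name) */
        default:                        return 1;
        }
    }

    // Size, in bytes, of motion_vector_buffer for a width x height region of interest.
    static size_t mv_buffer_bytes(size_t width, size_t height, cl_uint mb_block_type)
    {
        const size_t mbSize     = 16; /* source pixel blocks are always 16x16 */
        const size_t widthInMB  = (width  + mbSize - 1) / mbSize;
        const size_t heightInMB = (height + mbSize - 1) / mbSize;
        return widthInMB * heightInMB * vectors_per_block(mb_block_type) * sizeof(cl_short2);
    }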

    This kernel optionally takes a buffer of motion vector predictors via the prediction_motion_vector_buffer kernel argument. In many algorithms the motion vectors from the previous frame are used as prediction vectors for the current frame. Prediction vectors can also be used to estimate the motion vectors of the downscaled input image to implement hierarchical motion estimation algorithms. Essentially, using prediction vectors overcomes the hardware limitation on the maximum search radius, as you can offset the neighborhood to be searched, which can be coupled with a multi-pass approach to enable searching within arbitrary areas.

    The application can choose not to provide prediction motion vectors by providing NULL as the arg_value argument to clSetKernelArg(), in which case the prediction motion vectors are implied to be (0,0).

    A buffer of per-pixel-block distortion values (or “residuals”) can optionally be returned as well, which provides the sum-of-absolute-differences between best-match source and reference frame pixel blocks that produced the corresponding motion vector. The application can choose not to get the residuals by providing NULL as the arg_value argument to clSetKernelArg(), in which case this information is not returned.

    Refer to the extension specification document for details.

    The clEnqueueNDRangeKernel() call for the built-in kernel returns the usual error codes, augmented with a few VME-specific error codes described in the extension specification document. In particular, notice that this built-in kernel requires the local size argument to be NULL, so that the work-group size is determined at runtime, and it requires a 2D ND-range. Otherwise the clEnqueueNDRangeKernel() call fails and returns an error as described in the specification.

    Full Frame Motion Estimation Code Example

    The following code snippet demonstrates how to set up and queue a simple full-frame motion estimation pass for 16x16 pixel blocks (and a single resulting motion vector per block).

        cl_platform_id platform;
        cl_context context;
        cl_device_id device;
        cl_command_queue queue;
    
    // Initialize OpenCL via selecting Intel platform, create context with GPU device and a queue for the device as usual
    …
    
    // Get the func pointers to the accelerator routines 
    static clCreateAcceleratorINTEL_fn pfn_clCreateAcceleratorINTEL = (clCreateAcceleratorINTEL_fn)
    clGetExtensionFunctionAddressForPlatform(platform, "clCreateAcceleratorINTEL");
    
    // Create the program and the built-in kernel for the motion estimation
    cl_program program = clCreateProgramWithBuiltInKernels(context,1,device,"block_motion_estimate_intel",NULL);    
    cl_kernel kernel = clCreateKernel(program, "block_motion_estimate_intel", NULL);    
    
    // Create the accelerator for the motion estimation 
        cl_motion_estimation_desc_intel desc = { // VME API configuration knobs
    // Num of motion vectors per source pixel block, here a single vector per block
           CL_ME_MB_TYPE_16x16_INTEL,                                     
           CL_ME_SUBPIXEL_MODE_INTEGER_INTEL, // Motion vector precision
    // Adjust mode for the residuals, we don't compute them in this tutorial anyway: 
           CL_ME_SAD_ADJUST_MODE_NONE_INTEL,  
           CL_ME_SEARCH_PATH_RADIUS_16_12_INTEL // Search window radius
        };
        cl_accelerator_intel accelerator = 
            pfn_clCreateAcceleratorINTEL(context, 
            CL_ACCELERATOR_TYPE_MOTION_ESTIMATION_INTEL, 
            sizeof(cl_motion_estimation_desc_intel), &desc, 0);
    
        // Input images
        cl_image_format format = { CL_R, CL_UNORM_INT8 }; // luminance plane
    cl_mem srcImage = clCreateImage2D(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, &format, 
        width, height, 0, pSrcBuf, &err);  // copy the source luminance plane from host memory
    cl_mem refImage = clCreateImage2D(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, &format, 
        width, height, 0, pRefBuf, &err);  // copy the reference luminance plane from host memory
    
        // Compute number of output motion vectors 
        const int mbSize = 16; // size of the (input) pixel motion block
        size_t widthInMB  = (width + mbSize - 1) / mbSize;        
        size_t heightInMB = (height + mbSize - 1) / mbSize;
        // Output buffer for MB motion vectors
    cl_mem outMVBuffer = clCreateBuffer(context, CL_MEM_WRITE_ONLY, 
        widthInMB * heightInMB * sizeof(cl_short2), NULL, &err); // size, host_ptr, errcode_ret
    
        // Setup params for the built-in kernel 
        clSetKernelArg(kernel, 0, sizeof(cl_accelerator_intel), &accelerator);
        clSetKernelArg(kernel, 1, sizeof(cl_mem), &srcImage);
        clSetKernelArg(kernel, 2, sizeof(cl_mem), &refImage);
        clSetKernelArg(kernel, 3, sizeof(cl_mem), NULL); // disable predictor motion vectors
        clSetKernelArg(kernel, 4, sizeof(cl_mem), &outMVBuffer);
        clSetKernelArg(kernel, 5, sizeof(cl_mem), NULL); // disable extra motion block info output
    
        // Run the kernel
    // Notice that the kernel *requires* letting the runtime determine the local size, and requires a 2D ndrange
        const size_t originROI[2] = { 0, 0 };
        const size_t sizeROI[2] = { width, height};
        clEnqueueNDRangeKernel(queue, kernel, 2, originROI, sizeROI, NULL, 0, 0, 0);
    
        // Read resulting motion vectors
        clEnqueueReadBuffer(queue, outMVBuffer, CL_TRUE, 0,
    widthInMB * heightInMB * sizeof(cl_short2), pMVOut, 0, 0, 0);
    
    pfn_clReleaseAcceleratorINTEL(accelerator); // release pointer obtained via clGetExtensionFunctionAddressForPlatform, as with creation
        // Release other resources
        …
    

    Example Results

    The pictures below show two frames (reference and source) and the computed motion vectors overlaid on the second frame. Specifically, each vector is rendered as a stroke of the corresponding magnitude, pointing to the new (best-matched) position of its pixel block.

    Notice the radial pattern of the motion vectors, which is due to the nature of the transformation between frames (zoom in addition to the camera movement).

    VME Performance versus Quality Considerations

    You should carefully consider performance versus quality trade-offs when using hardware-assisted VME through the cl_intel_motion_estimation extension. Such trade-offs might be simply a function of the input image size, or of the requested density of motion vectors to be computed. The VME implementation computes motion vectors on a 16x16 source pixel block. You can control the number of output motion vectors to be computed by defining the number of output sub-blocks on each of these 16x16 source pixel blocks. As one would expect, requesting more output motion vectors increases VME computation cost.

    More precisely, the mb_block_type field of the cl_motion_estimation_desc_intel descriptor (refer to the section on the general accelerator API), which is supplied at VME accelerator creation time, defines the number of sub-blocks, and hence of motion vectors, computed within each 16x16 source pixel block: a single 16x16 block, four 8x8 sub-blocks, or sixteen 4x4 sub-blocks (compare the examples below).

    It is important to understand that sub-blocks are treated independently, so the smaller the sub-block, the more likely the VME implementation is to find a match. Smaller sub-blocks may be appropriate for compression applications that efficiently encode sub-block differences, but less appropriate for feature tracking applications. Using smaller sub-blocks not only increases VME computation cost, it also decreases the smoothness of the motion vector field (the vector directions appear noisier, see the figure below). Feature tracking applications may benefit from larger sub-blocks, because these more closely match feature sizes, and smoothness of the predicted motion is often desirable in such applications. Using larger sub-blocks therefore also decreases VME computation cost and improves application performance.
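    For instance, a minimal sketch of a descriptor that requests four motion vectors per 16x16 source block. CL_ME_MB_TYPE_8x8_INTEL is an assumed constant name that follows the extension's naming pattern (only the 16x16 value is used elsewhere in this article), and the output buffer must then be sized for four vectors per block, as in the sizing sketch earlier:

    cl_motion_estimation_desc_intel desc = {
        CL_ME_MB_TYPE_8x8_INTEL,               // assumed: four independent 8x8 sub-blocks per 16x16 block
        CL_ME_SUBPIXEL_MODE_INTEGER_INTEL,     // integer-pel precision, as in the earlier example
        CL_ME_SAD_ADJUST_MODE_NONE_INTEL,
        CL_ME_SEARCH_PATH_RADIUS_16_12_INTEL
    };
    // motion_vector_buffer now needs widthInMB * heightInMB * 4 * sizeof(cl_short2) bytes.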

    Below are examples of resulting motion vectors fields with the sub-block size varied:

    • 4x4 (16 resulting motion vectors per block), the leftmost image

       

    • 8x8 (4 resulting motion vectors per block), the middle image

       

    • 16x16 (single motion vector per block), the rightmost image

    Sub-block sizes are specified with the mb_block_type field of the cl_motion_estimation_desc_intel descriptor passed at VME accelerator creation. Notice the noisiness of the motion vector field in the leftmost image; the noisiness decreases in the images to the right.

    Conclusion

    Computing motion vectors is a key component of many popular video compression and computer vision algorithms. As it is a computationally intensive task, pure software implementations might present performance or energy efficiency challenges for some applications. In this article, we introduced a Video Motion Estimation (VME) extension for OpenCL* that leverages hardware-assisted motion vector estimation. We showed how to employ the set of VME extension host-callable functions for the task of computing motion vectors. Specifically, using this VME extension, one can estimate motion in a frame while trading off the number of resulting motion vectors against computation cost.

    About the Author

    Maxim Shevtsov is a Software Architect in the OpenCL performance team at Intel. He received his Master's degree in Computer Science in 2003. Prior to joining Intel in 2005, he conducted academic research in computer graphics.

    Intel, the Intel logo, Iris, Ultrabook, and VTune are trademarks of Intel Corporation in the U.S. and/or other countries.

    Copyright © 2013 Intel Corporation. All rights reserved.

    *Other names and brands may be claimed as the property of others.

  • OpenCL*
  • Video Motion Estimation
  • Motion Vectors
  • applications
  • Developers
  • Intermediate
  • Intel® SDK for OpenCL* Applications
  • OpenCL*
  • Development Tools
  • Game Development
  • Graphics
  • User Experience and Design
  • URL
  • Learning Lab
  • OpenCL-SDK-Learn
  • Intel® Graphics Performance Analyzers for Android* OS


    Introduction

    The Intel® Graphics Performance Analyzers (Intel® GPA) suite is a set of powerful graphics and gaming analysis tools that are designed to work the way game developers do, saving valuable optimization time by quickly providing actionable data to help developers find performance opportunities from the system level down to the individual draw call.

    Intel® GPA now supports Intel® Atom™ based phones running the Google* Android* OS. This version of the toolset allows application and driver engineers to optimize their OpenGL* ES 1.0/2.0 workloads on these phones from their choice of development system: Windows*, OS X*, or Ubuntu* OS. With this capability, as an Android* developer you can:

    • get a real-time view of over two dozen critical system metrics covering the CPU, GPU, and OpenGL* ES API
    • conduct a number of graphics pipeline experiments to isolate graphics bottlenecks

    To download a free copy of Intel GPA, browse to the Intel GPA Home Page, and click the Download button.

    Next Steps

    For more details on getting started with Intel GPA on the Android* OS, please refer to this article. You can also find more details on using Intel GPA by browsing the product's online help. The Intel GPA home page also contains links to product information, including information about analyzing DirectX* games on the Windows* OS platform, and related products that work with Intel GPA.

    If you want to be notified of Intel GPA product updates, use this link.

    As always, we welcome your suggestions, so please let us know what we can do to improve your use of these tools by posting your comments on the Intel GPA Support Forum.

    *Other names and brands may be claimed as the property of others.

  • vcsource_type_techarticle
  • vcsource_product_gpa
  • vcsource_domain_gamedev
  • vcsource_index
  • Developers
  • Android*
  • Android*
  • Intel® Graphics Performance Analyzers
  • Game Development
  • Phone
  • URL

  • Intel® GPA: Windows* 7/8/8.1 OS Support


    Introduction

    The Intel® Graphics Performance Analyzers (Intel® GPA) suite is a set of powerful graphics and gaming analysis tools that are designed to work the way game developers do, saving valuable optimization time by quickly providing actionable data to help developers find performance opportunities from the system level down to the individual draw call. To download a free copy of Intel GPA, browse to the Intel GPA Home Page, and click the Download button.

    Whereas Intel GPA supports the analysis of games and graphics applications on both the Windows* OS and the Android* OS for Intel® Atom™ phones, this article discusses the product's support of the Windows* OS platform. To learn more about Intel GPA on the Android* OS, see this article.

    Intel GPA fully supports all client versions of both Microsoft* Windows* 7 and Windows* 8/8.1 OS, including both 32-bit and 64-bit versions of these operating systems. However, note that Intel GPA does not support applications using the Microsoft* Windows* 8 RT version of this operating system, or the "starter" or "server" versions. Also, Intel GPA supports the following versions of DirectX*: 9/9Ex, 10.0/10.1, and 11.0. Furthermore, at this time the Intel GPA System Analyzer tool is not able to create frame capture or trace capture files for Windows* 8/8.1 Store Applications.

    To use Intel GPA on the Windows* OS platform, you either can run it in

    • single-system mode, where the Intel GPA tools run on the same system as your game; install Intel GPA on this system
    • "remote" mode, where your game and the tools run on different systems; install the product on both the client and target systems

    However, one big tip for developers -- ensure that you have the latest versions of the device drivers for your graphics system, since most Intel® Processor Graphics systems require updated drivers for the Windows* 8 OS. Download the latest Intel graphics driver for the Windows* OS from the Intel driver download site.

    Where to Next...

    The Intel GPA home page also contains links to product information, including online help and release notes, and information about the Android* OS version of the product. Here you'll also find information on other Intel products that work together with Intel GPA.

    If you want to be notified of Intel GPA product updates, use this link.

    As always, we welcome your suggestions, so please let us know what we can do to improve your use of these tools by posting your comments on the Intel GPA Support Forum.

    * Other names and brands may be claimed as the property of others.

  • vcsource_type_techarticle
  • vcsource_os_windows
  • vcsource_platform_desktoplaptop
  • vcsource_domain_graphics
  • vcsource_product_gpa
  • vcsource_domain_gamedev
  • vcsource_index
  • Microsoft Windows* 8
  • Intel® Graphics Performance Analyzers
  • Game Development
  • Graphics
  • Microsoft Windows* 8 Desktop
  • Microsoft Windows* 8 Style UI
  • Android Development: Multithreading and Handler in Detail


    Why multithreading is needed in Android development

    The Services, Activities, and Broadcast receivers we create are all handled on a single main thread, which we can think of as the UI thread. However, time-consuming operations such as reading or writing large files, database access, and network downloads can take a long time. To avoid blocking the user interface and triggering the ANR (Application Not Responding) dialog, we can use a Thread to handle such work.

      What problems arise when using Thread in Android

    For programmers with J2ME experience, Thread is straightforward: create an anonymous subclass, override the run() method, and call start() to execute it, or implement the Runnable interface. On the Android platform, however, UI controls are not designed to be thread-safe, so a synchronization mechanism is needed to refresh them. Here Google borrowed from the Win32 message-handling model when designing Android.

    The postInvalidate() method

    To refresh a View-based UI from a worker thread, you can call postInvalidate() from that thread. Overloads are also provided, such as postInvalidate(int left, int top, int right, int bottom) to invalidate a rectangular region, and delayed variants such as postInvalidateDelayed(long delayMilliseconds) or postInvalidateDelayed(long delayMilliseconds, int left, int top, int right, int bottom), where the first parameter is the delay in milliseconds:

    void postInvalidate()
    void postInvalidate(int left, int top, int right, int bottom)
    void postInvalidateDelayed(long delayMilliseconds)
    void postInvalidateDelayed(long delayMilliseconds, int left, int top, int right, int bottom)

    Handler

    The recommended approach, of course, is to use a Handler. In a thread's run() method you can call the handler object's post or sendMessage methods. An Android application internally maintains a message queue and keeps polling it to process these messages. If you are a Win32 programmer, this message handling will feel familiar, although Android does not expose internals such as a PreTranslateMessage hook.

    The handler is the message processor: it wraps the information to be passed into a Message, typically obtained by calling the handler object's obtainMessage(). The message is handed over via the handler's sendMessage(), and the Looper then places the Message into the MessageQueue. When the Looper sees a Message in the MessageQueue, it dispatches it, and the handler that receives the message processes it in its handleMessage() method.

    A Handler mainly receives data sent from worker threads and uses that data to help the main thread update the UI.

    When an application starts, Android first launches a main thread (the UI thread). The main thread manages the UI controls and dispatches events; for example, when you tap a Button, Android dispatches the event to that Button so it can respond to your action. If a time-consuming operation is needed at that point, such as fetching data over the network or reading a large local file, it must not run on the main thread; if it does, the interface appears to freeze, and if the operation has not completed within about 5 seconds the Android system shows a "force close" error. Such time-consuming work therefore belongs in a worker thread. But because it may involve UI updates, and the Android UI toolkit is not thread-safe, the UI may only be updated from the main thread; touching it from a worker thread is dangerous. This is where Handler comes in. Because the Handler runs on the main (UI) thread, it can exchange data with worker threads through Message objects: the Handler accepts the Message objects (containing data) that a worker thread sends with sendMessage(), places them into the main thread's queue, and works with the main thread to update the UI.

    Some characteristics of Handler: a handler can dispatch Message objects and Runnable objects to the main thread. Each Handler instance is bound to the thread that created it (usually the main thread). It serves two purposes:

    (1) to schedule a message or Runnable to be executed at some point on the thread it is bound to;
    (2) to arrange for an action to be executed on a different thread.

    The Handler methods for dispatching messages are:

            post(Runnable)
            postAtTime(Runnable, long)
            postDelayed(Runnable, long)
            sendEmptyMessage(int)
            sendMessage(Message)
            sendMessageAtTime(Message, long)
            sendMessageDelayed(Message, long)

    The post methods above enqueue a Runnable object on the main thread's queue; the sendMessage methods enqueue a Message object carrying data, to be processed when it is dequeued.

    Handler example

    The subclass must extend the Handler class and override the handleMessage(Message msg) method, which receives data from the thread. The following example changes the text of a Button in the UI from a worker thread.

    public class MyHandlerActivity extends Activity {

        Button button;
        MyHandler myHandler;

        protected void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            setContentView(R.layout.handlertest);

            button = (Button) findViewById(R.id.button);
            myHandler = new MyHandler();

            // When a new Handler instance is created, it is bound to the current
            // thread and its message queue and starts dispatching data.
            // A Handler has two purposes: (1) schedule Messages and Runnables to be
            // executed at some point; (2) have an action executed on a different thread.
            // It schedules messages with the following methods:
            //   post(Runnable)
            //   postAtTime(Runnable, long)
            //   postDelayed(Runnable, long)
            //   sendEmptyMessage(int)
            //   sendMessage(Message)
            //   sendMessageAtTime(Message, long)
            //   sendMessageDelayed(Message, long)
            // The post methods handle Runnable objects;
            // the sendMessage methods handle Message objects (which can carry data).

            MyThread m = new MyThread();
            new Thread(m).start();
        }

        /**
         * Receives and processes messages; the Handler runs together with the main thread.
         */
        class MyHandler extends Handler {

            public MyHandler() {
            }

            public MyHandler(Looper L) {
                super(L);
            }

            // Subclasses must override this method to receive data.
            @Override
            public void handleMessage(Message msg) {
                Log.d("MyHandler", "handleMessage......");
                super.handleMessage(msg);
                // The UI can be updated here.
                Bundle b = msg.getData();
                String color = b.getString("color");
                MyHandlerActivity.this.button.append(color);
            }
        }

        class MyThread implements Runnable {

            public void run() {
                try {
                    Thread.sleep(10000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }

                Log.d("thread.......", "mThread........");

                Message msg = new Message();
                Bundle b = new Bundle(); // holds the data
                b.putString("color", "my");
                msg.setData(b);

                // Send the message through the Handler to update the UI.
                MyHandlerActivity.this.myHandler.sendMessage(msg);
            }
        }
    }

      Looper

    In fact, every Thread in Android can have an associated Looper, which helps the thread maintain a message queue (this concept came up in the earlier post about the "Can't create handler inside thread" error); a Looper by itself is not tied to any particular Handler. From the open source code we can see that Android also provides HandlerThread, a Thread subclass that helps with this: its getLooper() method returns a handle to the thread's Looper, which we can attach to a Handler to implement a thread synchronization mechanism. Before a Looper can run, it must be initialized with Looper.prepare() (exactly the issue mentioned above), and its resources must be released when the thread is done, using Looper.quit().

    The Looper is the manager of the MessageQueue. A MessageQueue cannot exist without a Looper; a Looper object is created through the prepare() function, and each Looper object is associated with one thread. Calling Looper.myLooper() returns the Looper of the current thread.
    When a Looper object is created, a MessageQueue object is created along with it. Apart from the main thread, which has a default Looper, threads have no MessageQueue by default and therefore cannot receive Messages. If a thread needs to receive them, give it its own Looper (via prepare()); the thread then has its own Looper and MessageQueue.
    The Looper takes Messages out of the MessageQueue and hands them to the Handler's handleMessage() for processing. When processing is complete, Message.recycle() is called to put the Message back into the Message pool.

    Message

    In Android, a Handler can carry content: a Bundle object can wrap Strings, Integers, and binary (blob) data. From a thread we use the Handler object's sendEmptyMessage or sendMessage methods to deliver a Bundle to the Handler. The Handler class provides the handleMessage(Message msg) override, where msg.what distinguishes each message; unpacking the Bundle there lets the Handler update content on the UI thread and refresh controls. The sendXXXX methods of the Handler for sending messages are listed below; there are also the postXXXX methods. The idea is much the same as in Win32: one family returns right after posting, the other only after the message has been handled.

    Message: the message object, the item stored in the Message Queue; a Message Queue contains many Messages. A Message instance is usually obtained via the static Message.obtain() method, which has several overloads. It does not necessarily create a new instance: it first checks the Message Pool for a reusable Message instance and, if one exists, returns it; only if the pool has no available instance does it create a new Message with the given parameters. When removeMessages() is called, the Message is removed from the Message Queue and put back into the Message Pool. Besides this approach, you can also obtain a Message instance through a Handler object's obtainMessage().

    final boolean sendEmptyMessage(int what)
    final boolean sendEmptyMessageAtTime(int what, long uptimeMillis)
    final boolean sendEmptyMessageDelayed(int what, long delayMillis)
    final boolean sendMessage(Message msg)
    final boolean sendMessageAtFrontOfQueue(Message msg)
    boolean sendMessageAtTime(Message msg, long uptimeMillis)
    final boolean sendMessageDelayed(Message msg, long delayMillis)

    MessageQueue

    MessageQueue is a data structure whose name says it all: a message queue, the place where messages are stored. Each thread can own at most one MessageQueue. Creating a thread does not automatically create its MessageQueue; a Looper object is normally used to manage the thread's MessageQueue. When the main thread is created, a default Looper is created for it, and creating a Looper automatically creates a Message Queue. Other (non-main) threads do not get a Looper automatically; when one is needed, create it by calling prepare().
     
    A note on java.util.concurrent

    Programmers with a Java background will be familiar with the concurrent classes, an important feature added in JDK 1.5. On handheld devices we do not recommend using them directly, given the Task mechanism Android already provides, so we will not dwell on them here.

    Task and AsyncTask

    Android also provides an alternative to raw threads: Task and AsyncTask. The open source code shows that they are wrappers around the concurrent utilities, letting developers handle asynchronous tasks conveniently. There are of course many more methods and techniques related to synchronization; for reasons of time and space they are not described further here.


  • Game Development
  • Java*
  • Android*
  • Developers
  • Students
  • Android*
  • How to Integrate Intel® Perceptual Computing SDK with Cocos2D-x


    Downloads

    How to Integrate Intel® Perceptual Computing SDK with Cocos2D-x [PDF 482KB]

    Introduction

    In this article, we will explain the project we worked on as part of the Intel® Perceptual Computing Challenge Brazil, where we managed to achieve 7th place. Our project was Badaboom, a rhythm game set in the Dinosaur Era where the player controls a caveman, named Obo, by hitting bongos at the right time. If you’re curious to see the game in action, check out our video of Badaboom:

    To begin, you’ll need to understand a bit about Cocos2D-X, an open-source game engine that is widely used to create games for iPhone* and Android*. The good thing about Cocos2D-X is that it is cross-platform and thus is used to create apps for Windows* Phone, Windows 8, Win32*, Linux*, Mac*, and almost any platform you can think of. For more information, go to www.cocos2dx.org.

    We will be using the C++ version of the SDK (Version 9302) as well as the Cocos2D-X v2.2 (specifically the Win32 build with Visual Studio* 2012). Following the default pattern of Cocos2D, we will create a wrapper that receives and processes the data from the Creative* Interactive Gesture Camera and interprets it as “touch” for our game.

    Setting the environment

    To start, you’ll need to create a simple Cocos2D project. We will not cover this subject as it is not the focus of our article. If you need more information, you can find it on the Cocos2D wiki (www.cocos2dx.org/wiki).

    To keep it simple, execute the Python* script to create a new project in the “tools” folder of Cocos2d-x and open the Visual Studio project. Now we will add the Intel Perceptual Computing SDK to the project.

    To handle the SDK’s input, we will create a singleton class named CameraManager. This class starts the camera, updates the cycle, and adds two images to the screen that represent the position of the hands on the game windows.

    CameraManager is a singleton class that is derived from UtilPipeline and imports the “util_pipeline.h” file. Here, we need to reconfigure some of the Visual Studio project properties. Figure 1 shows how to add the additional include directories for the Intel Perceptual Computing SDK.

       $(PCSDK_DIR)/include
       $(PCSDK_DIR)/sample/common/include
       $(PCSDK_DIR)/sample/common/res


    Figure 1. Additional include directories

    You must also include the following paths to the additional library directories:

       $(PCSDK_DIR)/lib/$(PlatformName)
       $(PCSDK_DIR)/sample/common/lib/$(PlatformName)/$(PlatformToolset)


    Figure 2. Additional Library Directories

    Add the following dependencies in the input section:

       libpxc_d.lib
       libpxcutils_d.lib


    Figure 3. Additional Dependencies

    Now we are ready to work on our CameraManager!

    Start Coding!

    First we need to make the class a singleton. In other words, the class needs to be accessible from anywhere in the code with the same instance (singleton classes have only one instance). For this, you can use a method:

    CameraManager* CameraManager::getInstance(void)
    {
        if (!s_Instance)
        {
            s_Instance = new CameraManager();
        }
    
        return s_Instance;
    }

    After that, we’ll build a constructor, a method that starts the camera:

    CameraManager::CameraManager(void)
    {
    	if (!this->IsImageFrame()){
    		this->EnableGesture();
    
    		if (!this->Init()){
    			CCLOG("Init Failed");
    		}
    	}
    
    	this->hand1sprite = NULL;
    	this->hand2sprite = NULL;
    
    	hasClickedHand1 = false;
    	hasClickedHand2 = false;
    
    	this->inputAreas = CCArray::createWithCapacity(50);
    	this->inputAreas->retain();
    }

    Most of these commands initialize the variables that handle the sprites representing the user's hands and the input triggered when the hands close. The next step is processing the data that comes from the camera.

    void CameraManager::processGestures(PXCGesture *gesture){
    	
    	PXCGesture::Gesture gestures[2]={0};
    
    	gesture->QueryGestureData(0,PXCGesture::GeoNode::LABEL_BODY_HAND_PRIMARY,0,&gestures[0]);
    	gesture->QueryGestureData(0,PXCGesture::GeoNode::LABEL_BODY_HAND_SECONDARY,0,&gestures[1]);
    
    	
    	CCEGLView* eglView = CCEGLView::sharedOpenGLView();
    	switch (gestures[0].label)
    	{
    	case (PXCGesture::Gesture::LABEL_POSE_THUMB_DOWN):
    		CCDirector::sharedDirector()->end();
    		break;
    	case (PXCGesture::Gesture::LABEL_NAV_SWIPE_LEFT):
    		CCDirector::sharedDirector()->popScene();
    		break;
    	}
    }

    This is also the method where you can add switch cases to handle voice commands or implement additional gesture handlers (see the sketch below). After the gestures have been processed, we must display this information on the CCLayer (the Cocos2D sprite layer), which the Start() method shown after the sketch takes care of.
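    For example, a minimal sketch of extending the switch with one more gesture. LABEL_NAV_SWIPE_RIGHT is assumed to exist in the SDK, mirroring the LABEL_NAV_SWIPE_LEFT constant used above, and pausing the director on a right swipe is an illustrative choice, not part of the original game:

    // Inside CameraManager::processGestures(), the switch on gestures[0].label can grow more cases:
    switch (gestures[0].label)
    {
    case (PXCGesture::Gesture::LABEL_POSE_THUMB_DOWN):
        CCDirector::sharedDirector()->end();        // quit, as in the original handler
        break;
    case (PXCGesture::Gesture::LABEL_NAV_SWIPE_LEFT):
        CCDirector::sharedDirector()->popScene();   // go back, as in the original handler
        break;
    case (PXCGesture::Gesture::LABEL_NAV_SWIPE_RIGHT): // assumed label name
        CCDirector::sharedDirector()->pause();      // illustrative: pause animations until resume()
        break;
    }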

    bool CameraManager::Start(CCNode* parent){
    	this->parent = parent;
    
    	if (this->hand1sprite!=NULL
        &&  this->hand1sprite->getParent()!=NULL){
    		this->hand1sprite->removeFromParentAndCleanup(true);
    		this->hand2sprite->removeFromParentAndCleanup(true);
    	}
    
    	this->hand1sprite = CCSprite::create("/Images/hand.png");
    	this->hand1sprite->setOpacity(150);
        //To make it out of screen
    	this->hand1sprite->setPosition(ccp(-1000,-1000));
    	this->hand1Pos = ccp(-1000,-1000);
    	
    	this->hand2sprite = CCSprite::create("/Images/hand.png");
    	this->hand2sprite->setFlipX(true);
    	this->hand2sprite->setOpacity(150);
    	this->hand2sprite->setPosition(ccp(-1000,-1000));
    	this->hand2Pos = ccp(-1000,-1000);
    
    	parent->addChild(this->hand1sprite, 1000);
    	parent->addChild(this->hand2sprite, 1000);
    	
    	this->inputAreas->removeAllObjects();
    	return true;
    }

    This method should be called each time a new scene or layer is shown, usually from its onEnter callback. It automatically removes the hand sprites from their previous parent and adds them to the new CCLayer.
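    A minimal sketch of this wiring, together with the scheduleUpdate() call mentioned in the next paragraph; GameLayer is a hypothetical layer class, and only the CameraManager calls are taken from this article:

    // Hypothetical game layer: attach the camera manager whenever the layer appears.
    void GameLayer::onEnter()
    {
        CCLayer::onEnter();
        CameraManager::getInstance()->Start(this);   // re-parents the hand sprites to this layer
        this->scheduleUpdate();                      // schedules GameLayer::update(float) every frame
    }

    void GameLayer::update(float dt)
    {
        CameraManager::getInstance()->update(dt);    // polls the camera and moves the hand sprites
    }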

    Now that the hand sprites have been added to the CCLayer, we can update their positions by calling the following method from the layer's update cycle (scheduled with this->scheduleUpdate(), as in the sketch above). The update method is:

    void CameraManager::update(float dt){
    
    	if (!this->AcquireFrame(true)) return;
    
    	PXCGesture *gesture=this->QueryGesture();
    	
    	this->processGestures(gesture);
    
    	PXCGesture::GeoNode nodes[2][1]={0};
    	
        gesture->QueryNodeData(0,PXCGesture::GeoNode::LABEL_BODY_HAND_PRIMARY,1,nodes[0]);
    	gesture->QueryNodeData(0,PXCGesture::GeoNode::LABEL_BODY_HAND_SECONDARY,1,nodes[1]);
    
    	CCSize _screenSize = CCDirector::sharedDirector()->getWinSize();
    
    	
    	if (nodes[0][0].openness<20 && !this->hand1Close){
    		this->hand1sprite->removeFromParentAndCleanup(true);
    		this->hand1sprite = CCSprite::create("/Images/hand_close.png");
    		this->hand1sprite->setOpacity(150);
    		this->parent->addChild(hand1sprite);
    		this->hand1Close = true;
    	} else if (nodes[0][0].openness>30 && this->hand1Close) {
    		this->hand1sprite->removeFromParentAndCleanup(true);
    		this->hand1sprite = CCSprite::create("/Images/hand.png");
    		this->hand1sprite->setOpacity(150);
    		this->parent->addChild(hand1sprite);
    		this->hand1Close = false;
    	}
    	
    	if (nodes[1][0].openness<20 && !this->hand2Close){
    		this->hand2sprite->removeFromParentAndCleanup(true);
    		this->hand2sprite = CCSprite::create("/Images/hand_close.png");
    		this->hand2sprite->setFlipX(true);
    		this->hand2sprite->setOpacity(150);
    		this->parent->addChild(hand2sprite);
    		this->hand2Close = true;
    	} else if (nodes[1][0].openness>30 && this->hand2Close) {
    		this->hand2sprite->removeFromParentAndCleanup(true);
    		this->hand2sprite = CCSprite::create("/Images/hand.png");
    		this->hand2sprite->setFlipX(true);
    		this->hand2sprite->setOpacity(150);
    		this->parent->addChild(hand2sprite);
    		this->hand2Close = false;
    	}
    
    	this->hand1Pos = ccp(_screenSize.width*1.5-nodes[0][0].positionImage.x*(_screenSize.width*HAND_PRECISION/320) + 100,
    						 _screenSize.height*1.5-nodes[0][0].positionImage.y*(_screenSize.height*HAND_PRECISION/240));
    	this->hand2Pos = ccp(_screenSize.width*1.5-nodes[1][0].positionImage.x*(_screenSize.width*HAND_PRECISION/320) - 100,
    						 _screenSize.height*1.5-nodes[1][0].positionImage.y*(_screenSize.height*HAND_PRECISION/240));
    
    	if (!hand1sprite->getParent() || !hand2sprite->getParent()){
    		return;
    	}
    	this->hand1sprite->setPosition(this->hand1Pos);
    	this->hand2sprite->setPosition(this->hand2Pos);
    
    	
        CCObject* it = NULL;
    	CCARRAY_FOREACH(this->inputAreas, it)
    	{
    		InputAreaObject* area = dynamic_cast<InputAreaObject*>(it);
    		this->checkActionArea(area->objPos, area->radius, area->sender, area->method);
    	}
    			
    	this->ReleaseFrame();
    
    }

    This code not only positions the hand sprites, it also switches to a different sprite (hand_close.png) when the camera detects that a hand is less than 20% open. In addition, there is simple logic (the HAND_PRECISION scaling) that makes the input more sensitive and makes it easier to reach the edges of the screen. We do this because the Perceptual Camera is not very precise near the edges, and the reported positions of the sprites become unstable as a hand approaches an edge.

    Next, we need a way to handle the input itself (a closed hand is treated as a touch). We write a method called checkActionArea (called from the update method) and register each action area.

    void CameraManager::checkActionArea(CCPoint objPos, float radius, CCObject* sender, SEL_CallFuncO methodToCall){
    
    	if (sender==NULL)
    		sender = this->parent;
    
    	float distanceTargetToHand = ccpDistance(this->hand1Pos, objPos);
    	if (distanceTargetToHand<radius){
    		if (this->hand1Close&& !hasClickedHand1){
    
    			this->parent->runAction(CCCallFuncO::create(this->parent, methodToCall, sender));
    			hasClickedHand1 = true;
    		}
    	}
    	
    	if (!this->hand1Close){
    		hasClickedHand1 = false;
    	} //TODO: repeat for hand2
    }

    The registerActionArea() method handles the registration of these areas:

    void CameraManager::registerActionArea(CCPoint objPos, float radius, cocos2d::SEL_CallFuncO methodToCall){
    
    	InputAreaObject* newInputArea = new InputAreaObject(objPos, radius, methodToCall);
    	this->inputAreas->addObject(newInputArea);
    }

    Now it is easy to add the Intel Perceptual Computing SDK to your Cocos2D game!!! Just run:

    CameraManager::getInstance()->Start(this);

    When entering the Layer, register the objects and methods to be called:

    CameraManager::getInstance()->registerActionArea(btn_exit->getPosition(), 150, callfuncO_selector(LevelSelectionScene::backClicked));

    About us!

    We hope you have liked our short tutorial. Feel free to contact us with any issues or questions!

    Naked Monkey Games is an indie game studio located in São Paulo, Brazil, and currently part of the Cietec Incubator. It partners with Intel on new and exciting technology projects!

    Please follow us on Facebook (www.nakedmonkey.mobi) and Twitter (www.twitter.com/nakedmonkeyG).

    Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.
    Copyright © 2013 Intel Corporation. All rights reserved.
    *Other names and brands may be claimed as the property of others.

  • Intel® Perceptual Computing Challenge
  • Cross-Platform Development
  • Visual Studio
  • Cocos 2D
  • Developers
  • Android*
  • Apple iOS*
  • Microsoft Windows* 8
  • Windows*
  • C/C++
  • Development Tools
  • Game Development
  • Graphics
  • Microsoft Windows* 8 Desktop
  • URL
  • Intel® Graphics Performance Analyzers (Intel® GPA) FAQ


    Intel® Graphics Performance Analyzers (Intel® GPA) FAQ



    Table of Contents

    General Product Information

    Using Intel® GPA

    Technical Requirements

    Product Support


    General Product Information

    Q: What is Intel GPA, and what do I use it for?
    A: Intel GPA is a powerful, agile tool suite enabling game developers to utilize the full performance potential of their gaming platform, including (though not limited to) Intel® Core™ and Intel® HD Graphics, as well as Intel phones running the Android* OS. Intel GPA visualizes performance data from your application, enabling you to understand system-level and individual frame performance issues, as well as allowing you to perform 'what-if' experiments to estimate potential performance gains from optimizations.

    Q: Which platforms does Intel GPA support?
    A: Intel GPA supports multiple analysis and target operating systems. Here is a table summarizing the supported platforms:

    Target Platform (where your game runs): Microsoft* Windows* 7 (x64 only) OS
    Client/Analysis Platform (your development system): Microsoft* Windows* 7/8/8.1 OS
    Target Graphics API: Microsoft* DirectX* 9/9Ex, 10.0/10.1, 11.0

    Target Platform (where your game runs): Microsoft* Windows* 8/8.1 (x64 only) OS, including Windows* 8/8.1 Store Applications
    Client/Analysis Platform (your development system): Microsoft* Windows* 7/8/8.1 OS
    Target Graphics API: Microsoft* DirectX* 9/9Ex, 11.0

    Target Platform (where your game runs): Google* Android* 4.0, 4.1, 4.2 (limited to Intel® Atom™ based phones)
    Client/Analysis Platform (your development system): Microsoft* Windows* 7/8/8.1 OS; Apple* OS X* 10.7, 10.8; Ubuntu* OS 11.10, 12.04
    Target Graphics API: OpenGL*-ES 1.0, 2.0

    See the Intel GPA Release Notes for detailed product information on each of these platforms.

    Q: Is Intel GPA really free?
    A: The product is available at no charge for our valued development community -- to download Intel GPA visit the Intel GPA Home Page.

    Q: How does Intel GPA compare with other Intel products such as Intel® VTune™ Performance Analyzer and Intel® Parallel Studio, and how do I select the right one for my analysis/optimization needs?
    A: Intel GPA offers complementary profiling capabilities to other Intel tools focused on debugging and deep hotspot analysis. Intel GPA can help determine whether potential performance bottlenecks exist, and offers the ability to perform "what if" experiments to help optimize the graphics portion of your application. For even deeper performance analysis you can use Intel VTune Amplifier XE with Intel GPA to fine-tune games and media for optimal performance, ensuring cores are fully exploited and new processor capabilities are supported to the fullest.

    Q: What are the key advantages of Intel GPA?
    A: Intel has worked extensively with game developers to create a product that precisely meets their needs, so they can quickly optimize games. The key advantages of using Intel GPA are:

    • Intuitive interface: Quickly find issues, without a lot of clutter; the product's easy workflow fits the way game developers want to optimize their games.
    • In-depth, real-time analysis: Identify bottlenecks, experiment with changes, and see results in real time - all within Intel GPA and without modifying the game code.
    • Multiple platform support: Optimize games and graphics-intensive applications for Intel systems utilizing processor graphics, or Intel Atom phones running the Android* OS. When possible, Intel GPA accesses hardware metrics in these devices for more accurate measurements of the game's use of the rendering pipeline.
    • Task timeline visualization: Use the Intel® GPA Platform Analyzer to see how your task system is balanced (or not) across multiple threads on both the CPU and GPU.

    Q: Have developers been able to use Intel GPA to improve the performance of "real world" games?
    A: Many developers have utilized Intel GPA to improve game performance on PCs utilizing Intel processor graphics. Many of these games can be found in the Game Gallery. Just a few examples of Intel GPA-enabled titles are Need For Speed: World*, DarkSpore*, LEGO Universe*, Civilization V*, Stalker: Call of Pripyat*, Demigod*, Empire: Total War*, Napoleon: Total War*, and Ghostbusters: The Video Game*. The performance gains in these games resulted in an increased frame rate and/or additional game features that improved the user experience.

    Q: Where do I find out more information about Intel GPA?
    A: To find out more about the Intel GPA tool suite, visit the Intel GPA Home Page. The product's home site provides detailed information about the tool, including information on how to download the tool, training and support resources, and videos on the product to help you get started quickly.


    Using Intel® GPA

    Q: How do I start using Intel GPA?
    A: It is pretty easy to get started with Intel GPA -- most game developers start using Intel GPA immediately after installing the package, since Intel GPA uses standard graphics drivers and does not require modifications to your game code (one exception is if you are trying to perform thread-based analysis with the Intel® GPA Platform Analyzer, which requires that you add some code to designate individual threads). To get you up and running quickly, check out the Intel GPA Getting Started Guide, which shows you how to run the main features of the product.

    Q: How difficult is it to learn how to use the product?
    A: The Intel GPA product features an intuitive user interface that does not require extensive training to quickly access key performance metrics. Therefore, many users will immediately realize many benefits of the product. However, as Intel GPA enables you to perform precise analysis and experiments for every portion of the rendering pipeline, users with a detailed knowledge of Microsoft* DirectX* can quickly utilize even these advanced features.

    Q: What kinds of problems can Intel GPA find?
    A: If you have performance "hot spots" within your game, Intel GPA can help pinpoint them at the system level, at the frame or sub-frame level, or by visualizing task performance across the CPU/GPU. Once you have identified issues, try different experiments to see the resulting changes in the rendering time as well as the visual effect of these changes. The benefit is that Intel GPA can help improve your frame rate and/or enable you to add new visual effects, while still providing an acceptable level of user interactivity.

    Q: How do the Intel GPA System Analyzer and Intel GPA Frame Analyzer help identify optimization opportunities in my game?
    A: The Intel GPA System Analyzer application provides access to system-wide metrics for your game, including CPU, GPU, API, and the graphics driver. The metrics available will vary depending upon your platform, but for both Microsoft* Windows* and Google* Android* you will find a large collection of useful metrics to help quantify key aspects of your application's use of system resources. Within the Intel GPA System Analyzer you can also perform various "what-if" experiments to diagnose at a high level where your game's performance bottlenecks are concentrated.

    • If the Intel GPA System Analyzer finds that your game is CPU-bound, perform additional fine-tuning of your application using one of the Intel performance optimization products, such as Intel® Parallel Studio or the Intel® VTune™ Amplifier XE performance profiler.
    • If the Intel GPA System Analyzer finds that your game is GPU-bound, use the Intel GPA Frame Analyzer to drill down within a single graphics frame to pinpoint specific rendering problems, such as texture bandwidth, pixel shader performance, level-of-detail issues, or other bottlenecks within the rendering pipeline. For example, use the "simple pixel shader" experiment to determine whether shader complexity is a bottleneck.

    Q: How does the Intel GPA Platform Analyzer help identify optimization opportunities in my game?
    A: The Intel GPA Platform Analyzer visualizes the execution profile of the tasks in your code on the entire PC platform over time on both the CPU and GPU. This helps you understand task-based issues within your game, enabling you to optimize the compute and rendering tasks across both the CPU and GPU. The Intel GPA Platform Analyzer uses trace data collected during the application run to provide a detailed analysis of how your code executes across all threads, and correlates the CPU workload with that on the GPU.

    Q: Though Intel GPA seems to be targeting game developers, will Intel GPA work with other graphics applications?
    A: Intel GPA is primarily designed to solve the performance optimization needs of game developers, and for media application developers. However, the features of Intel GPA are broad-based for use with any visual computing application. In other words, our expectation is that anyone developing graphics applications, both "expert" and "novice" alike, should be able to take advantage of the analysis and optimization capabilities of the product.

    Q: Do I have to modify the software for my game, or install special drivers, in order to be able to use Intel GPA?
    A: The Intel GPA System Analyzer and the Intel GPA Frame Analyzer tools can analyze your game without any code modifications or special libraries. This is possible because Intel GPA accesses the CPU, driver, DirectX*, and GPU metrics directly from the game environment, a big plus for common analysis tasks. For more complex, task-based analysis with the Intel GPA Platform Analyzer, you will benefit by inserting Intel® Instrumentation and Tracing Technology API calls (ITT) that tag the various tasks in your game code -- but you will have to do this once, as the ITT library used by Intel GPA is also used by a number of other Intel performance analysis tools.


    Technical Requirements

    Q: What are the Intel® GPA system requirements?
    A: As the specific requirements depend both upon your target platform and analysis platform, read the Intel GPA release notes for detailed system requirements.

    Q: What graphics devices does Intel® GPA support?
    A: When the target platform is Windows* OS, Intel® GPA supports Intel® HD graphics (including Intel® HD Graphics 2000/3000 or later). Although Intel GPA may work with other graphics devices, they are unsupported, and some features and metrics may not be available on unsupported platforms. For the Google* Android* OS, Intel GPA only supports Intel phones based upon the Intel® Atom™ processor. Other Android* phones or tablets are not supported, and you should not attempt to run Intel GPA on these non-supported Android devices. When using Intel GPA in the client/target mode, the minimum requirements of the client system used to analyze either Windows* or Android* workloads are: Intel Core Processor with a minimum of 2GB of memory, though at least 4GB of memory and a 64-bit OS are highly recommended.

    Q: Does Intel® GPA work with netbook computers and Intel Ultrabooks™?
    A: Yes, Intel® GPA supports many popular netbook computers and Ultrabook systems. However, lower end systems may not have sufficient resources to run the Intel GPA tools. So if you encounter issues when running Intel GPA System Analyzer HUD or Intel GPA Frame Analyzer, use the client/target ("networked") version of these tools with a more powerful client system. A good suggestion for a client system would be a computer running a 64-bit OS with greater than 4GB of memory.



    Q: Is Intel® GPA Frame Analyzer supported on Windows* 32-bit platforms?
    A: No - frame captures taken on 32-bit Windows* OS can be opened remotely from Intel GPA Frame Analyzer installed on a 64-bit Windows* OS platform. For more information, see Intel® GPA: Limited support for Windows* OS 32-bit platforms.


    Product Support

    Q: How do I get support for Intel GPA?
    A: The primary support model for Intel GPA is through the product's Support Forum and Knowledge Base articles. At the Support Forum you can ask questions about the product, share your experiences with other users of the product, and ask for assistance should you encounter issues when using Intel GPA. The Knowledge Base contains various technical notes, "tips & tricks", training material, and pointers to other information that may be of interest to both novice and experienced users alike.

    Q: Will Intel GPA support all future Intel graphics devices?
    A: Intel intends to continue offering the tools that enable developers to take the best advantage of Intel graphics devices, both now and into the future. Intel will continue to identify, with close cooperation from developers, the best tools to enable optimization and performance of these devices.

    Q: When analyzing DirectX* applications, what should I expect to see if I attempt to use Intel® GPA on non-supported graphics devices?
    A: Features and performance will vary based upon the hardware capabilities of these other configurations. For example, on non-supported Windows* target devices the Intel GPA System Analyzer cannot provide many detailed GPU metrics, but many of the Intel GPA Frame Analyzer functions work on any graphics device (though again you will typically see fewer GPU metrics on these devices).

    Q: What is your plan for supporting OpenGL* on the Windows* platform?
    A: We are actively exploring enhancing Intel GPA to support OpenGL* on the Microsoft* Windows* OS -- if you have specific OpenGL* needs or feature requests, we would love to hear from you at the Intel GPA Support Forum. If you are analyzing Android* applications on Intel phones, the product already supports the Open-GL* API on these platforms.

    Q: How do I submit suggestions or feedback to the development team?
    A: Use the Intel GPA Support Forum to submit suggestions on new features, and/or to comment on the features currently in the product.


    * Other names and brands may be claimed as the property of others.

  • vcsource_type_techarticle
  • vcsource_domain_media
  • vcsource_os_windows
  • vcsource_platform_desktoplaptop
  • vcsource_domain_graphics
  • vcsource_product_gpa
  • vcsource_domain_gamedev
  • vcsource_index
  • Intel® Graphics Performance Analyzers
  • Game Development
  • URL
  • Shadow Mapping Algorithm for Android*


    By Stanislav Pavlov

    Downloads


    Shadow Mapping Algorithm for Android* [PDF 440KB]

    "There is no light without shadows" - Japanese proverb

    Because shadows in games make them more realistic and interesting, including well-rendered shadows in your games is important. Currently, most games do not have shadows, but this situation is changing. In this paper we will discuss a common method for realizing shadows, called Shadow Mapping.

    Shadow Mapping Theory

    Shadow mapping is one of the most conventional techniques for shadow generation in real-time applications. The method is based on the observation that whatever can be seen from the position of the light source is lit; the rest is in shade. The principle is to compare the depth of the current fragment, in the coordinate system of the light source, with the depth of the geometry closest to the light source that is stored in the shadow map.

    The algorithm consists of just two stages:

    1. The shadow map generation
    2. The rendering stage

    The algorithm's main advantage is that it is easy to understand and implement. Its disadvantage is that it requires additional CPU and GPU resources and calculations to make the picture more realistic: generating the shadow map in the depth buffer is an extra rendering pass, so the application can become slower.

    Algorithm Realization

    To create a shadow map, it is necessary to render the scene from the position of the light source. We thus obtain, in the depth buffer, a shadow map containing the depth values of the geometry closest to the light source. This approach has the advantage of speed, since depth buffer generation is implemented in hardware.

    At the final stage, rendering occurs from the camera position. Each point of the scene is transformed into the coordinate system of the light source, and the distance from this point to the light source is calculated. The calculated distance is compared with the value stored in the shadow map; if the distance from the point to the light source is greater than the stored value, the point is in shadow, occluded by some object in the path of the light.
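    As a minimal sketch of the host-side setup for this final stage, the light's matrices and the shadow map are passed to the shader that performs the comparison. The program and uniform handles (m_shadowedProgram, uniformLightProjection, uniformLightModelview, uniformShadowMap) are hypothetical names; m_textureShadow and the light matrices come from the code later in this article:

    // Final pass, rendered from the camera: in addition to the camera's own matrices,
    // give the shader what it needs to transform each point into light space and
    // compare its depth against the shadow map.
    glUseProgram(m_shadowedProgram);                 // hypothetical program for the final pass
    glUniformMatrix4fv(uniformLightProjection, 1, 0, lightProjectionMatrix.Pointer());
    glUniformMatrix4fv(uniformLightModelview, 1, 0, lightModelviewMatrix.Pointer());

    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_2D, m_textureShadow);   // depth texture filled by the shadow pass
    glUniform1i(uniformShadowMap, 0);                // the shadow map sampler reads texture unit 0

    // In the fragment shader, the light-space position (after the perspective divide and a
    // [-1,1] to [0,1] remap) indexes the shadow map; if the fragment's light-space depth is
    // greater than the stored value, the fragment is in shadow.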

    The code in this article uses the Android SDK (ver. 20) and the Android NDK (ver 8d). It is taken as the basis for a fully native application:  http://developer.android.com/reference/android/app/NativeActivity.html

    The Android MegaFon Mint* smartphone is based on the Intel® Atom™ processor Z2460: http://download.intel.com/newsroom/kits/ces/2012/pdfs/AtomprocessorZ2460.pdf

    Initialization

    The shadow map is stored in a separate texture with format GL_DEPTH_COMPONENT, size 512x512 (shadowmapSize.x = shadowmapSize.y = 512), and 32 bits per texel (GL_UNSIGNED_INT). As an optimization, you can use 16-bit textures (GL_UNSIGNED_SHORT). Creating such a texture is possible on devices supporting GL_OES_depth_texture [for documentation see http://www.khronos.org/registry/gles/extensions/OES/OES_depth_texture.txt].

    The GL_TEXTURE_WRAP_S and GL_TEXTURE_WRAP_T parameters are set to GL_CLAMP_TO_EDGE, so whenever the sampler is asked for a value outside the texture, the value at the boundary is returned. This is done to reduce artifacts from the shadows in the final rendering stage. "Tricks with the margins" will be discussed in another blog.

            //Create the shadow map texture
    	glGenTextures(1, &m_textureShadow);
    	glBindTexture(GL_TEXTURE_2D, m_textureShadow);
    	checkGlError("bind texture");
    	// Create the depth texture.
    	glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT, shadowmapSize.x, shadowmapSize.y, 0, GL_DEPTH_COMPONENT, GL_UNSIGNED_INT, NULL);
    	checkGlError("image2d");
    	// Set the textures parameters
    	glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    	glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
    	glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    	glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    	// create frame buffer object for shadow pass
    	glGenFramebuffers(1, &m_fboShadow);
    	glBindFramebuffer(GL_FRAMEBUFFER, m_fboShadow);
    	glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_TEXTURE_2D, m_textureShadow, 0);
    	checkGlError("shadowmaptexture");
    	status = glCheckFramebufferStatus(GL_FRAMEBUFFER);
    	if(status != GL_FRAMEBUFFER_COMPLETE) {
    		LOGI("init: ");
    		LOGI("failed to make complete framebuffer object %xn", status);
    	}
    	glBindFramebuffer(GL_FRAMEBUFFER, 0);
    

    The next initialization phase is the preparation of shaders.

    Below is the vertex shader for the shadow map generation step:

    attribute vec3 Position;
    
    uniform mat4 Projection;
    uniform mat4 Modelview;
    
    void main(void)
    {
    	gl_Position = Projection * Modelview * vec4(Position, 1);
    }
    
    Pixel shader (shadow map generation step):
    highp vec4 Color = vec4(0.2, 0.4, 0.5, 1.0);
    
    void main(void)
    {
    	gl_FragColor = Color;
    }
    

    The main task of these shaders is to write out the geometry, in other words, to generate the depth buffer for the main stage.

    Stages of shadow map rendering

    These steps differ from the usual scene rendering in the following ways:

    1. The FBO that acts as our depth buffer, with the shadow map texture attached, is bound: glBindFramebuffer(GL_FRAMEBUFFER, m_fboShadow).
    2. Shadows can be rendered with an orthographic projection for directional sources (the sun) or with a perspective projection for conical (omni) sources. In this example, the chosen perspective projection matrix lightProjectionMatrix has a wide viewing angle of 90 degrees.
    3. Color writes to the frame buffer are disabled with glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE). This optimization can be very useful if you use a complex pixel shader.
    4. At this stage, only the back faces of the polygons are drawn into the map: glCullFace(GL_FRONT). This is one of the simplest and most effective ways to reduce shadow-map artifacts. (Note: this is not useful for all geometries.)
    5. The drawing area is one pixel smaller on each side than the shadow map: glViewport(0, 0, shadowmapSize.x - 2, shadowmapSize.y - 2). This is done to leave a "field" (border) around the shadow map.
    6. After drawing all the elements of the scene, the state is restored to its original values: glCullFace(GL_BACK), glBindFramebuffer(GL_FRAMEBUFFER, 0), and glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE).
    void RenderingEngine2::shadowPass() {
    	GLenum status;
    	glEnable(GL_DEPTH_TEST);
    	glBindFramebuffer(GL_FRAMEBUFFER, m_fboShadow);
    	status = glCheckFramebufferStatus(GL_FRAMEBUFFER);
    	if (status != GL_FRAMEBUFFER_COMPLETE) {
    		LOGE("Shadow pass: ");
    		LOGE("failed to make complete framebuffer object %xn", status);
    	}
    	glClear(GL_DEPTH_BUFFER_BIT);
    	glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
    
    	lightProjectionMatrix = VerticalFieldOfView(90.0,
    			(shadowmapSize.x + 0.0) / shadowmapSize.y, 0.1, 100.0);
    	lightModelviewMatrix = LookAt(vec3(0, 4, 7), vec3(0.0, 0.0, 0.0), vec3(0, -7, 4));
    	glCullFace(GL_FRONT);
    	glUseProgram(m_simpleProgram);
    	glUniformMatrix4fv(uniformProjectionMain, 1, 0,
    			lightProjectionMatrix.Pointer());
    	glUniformMatrix4fv(uniformModelviewMain, 1, 0,
    			lightModelviewMatrix.Pointer());
    	glViewport(0, 0, shadowmapSize.x - 2, shadowmapSize.y - 2);
    
    	GLsizei stride = sizeof(Vertex);
    	const vector& objects = m_Scene.getModels();
    	const GLvoid* bodyOffset = 0;
    	for (int i = 0; i < objects.size(); ++i) {
    		lightModelviewMatrix = objects[i].m_Transform * LookAt(vec3(0, 4, 7), vec3(0.0, 0.0, 0.0), vec3(0, -7, 4));
    		glUniformMatrix4fv(uniformModelviewMain, 1, 0, lightModelviewMatrix.Pointer());
    		glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, objects[i].m_indexBuffer);
    		glBindBuffer(GL_ARRAY_BUFFER, objects[i].m_vertexBuffer);
    
    		glVertexAttribPointer(attribPositionMain, 3, GL_FLOAT, GL_FALSE, stride,
    				(GLvoid*) offsetof(Vertex, Position));
    
    		glEnableVertexAttribArray(attribPositionMain);
    
    		glDrawElements(GL_TRIANGLES, objects[i].m_indexCount, GL_UNSIGNED_SHORT,
    				bodyOffset);
    
    		glDisableVertexAttribArray(attribPositionMain);
    	}
    	glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
    
    	glBindFramebuffer(GL_FRAMEBUFFER, 0);
    	glCullFace(GL_BACK);
    }
    

     

    Rendering scenes with shadows

    The first step of the main pass is to bind the texture containing the shadow map obtained in the previous stage:

    glActiveTexture(GL_TEXTURE0);
    	glBindTexture(GL_TEXTURE_2D, m_textureShadow);
    	glUniform1i(uniformShadowMapTextureShadow, 0);
    
    void RenderingEngine2::mainPass() {
    	glClearColor(0.5f, 0.5f, 0.5f, 1);
    	glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    	modelviewMatrix = scale * rotation * translation
    			* LookAt(vec3(0, 8, 7), vec3(0.0, 0.0, 0.0), vec3(0, 7, -8));
    	lightModelviewMatrix = LookAt(vec3(0, 4, 7), vec3(0.0, 0.0, 0.0), vec3(0, -7, 4));
    
    	projectionMatrix = VerticalFieldOfView(45.0, (screen.x + 0.0) / screen.y, 0.1, 100.0);
    	mat4 offsetLight = mat4::Scale(0.5f) * mat4::Translate(0.5, 0.5, 0.5);
    	mat4 lightMatrix = lightModelviewMatrix * lightProjectionMatrix	* offsetLight;
    	glUseProgram(m_shadowMapProgram);
    	glUniformMatrix4fv(uniformLightMatrixShadow, 1, 0, lightMatrix.Pointer());
    	glUniformMatrix4fv(uniformProjectionShadow, 1, 0, projectionMatrix.Pointer());
    	glUniformMatrix4fv(uniformModelviewShadow, 1, 0, modelviewMatrix.Pointer());
    
    	glViewport(0, 0, screen.x, screen.y);
    
    	glActiveTexture(GL_TEXTURE0);
    	glBindTexture(GL_TEXTURE_2D, m_textureShadow);
    	glUniform1i(uniformShadowMapTextureShadow, 0);
    
    	GLsizei stride = sizeof(Vertex);
    	const vector& objects = m_Scene.getModels();
    	const GLvoid* bodyOffset = 0;
    	for (int i = 0; i < objects.size(); ++i) {
    		modelviewMatrix = scale * rotation * translation * LookAt(vec3(0, 8, 7), vec3(0.0, 0.0, 0.0), vec3(0, 7, -8));
    		glUniformMatrix4fv(uniformTransformShadow, 1, 0, objects[i].m_Transform.Pointer());
    		glUniformMatrix4fv(uniformModelviewShadow, 1, 0, modelviewMatrix.Pointer());
    		glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, objects[i].m_indexBuffer);
    		glBindBuffer(GL_ARRAY_BUFFER, objects[i].m_vertexBuffer);
    
    		glVertexAttribPointer(attribPositionShadow, 3, GL_FLOAT, GL_FALSE,
    				stride, (GLvoid*) offsetof(Vertex, Position));
    		glVertexAttribPointer(attribColorShadow, 4, GL_FLOAT, GL_FALSE, stride,
    				(GLvoid*) offsetof(Vertex, Color));
    		glVertexAttribPointer(attribNormalShadow, 3, GL_FLOAT, GL_FALSE, stride,
    				(GLvoid*) offsetof(Vertex, Normal));
    		glVertexAttribPointer(attribTexCoordShadow, 2, GL_FLOAT, GL_FALSE,
    				stride, (GLvoid*) offsetof(Vertex, TexCoord));
    
    		glEnableVertexAttribArray(attribPositionShadow);
    		glEnableVertexAttribArray(attribNormalShadow);
    		glEnableVertexAttribArray(attribColorShadow);
    		glEnableVertexAttribArray(attribTexCoordShadow);
    
    		glDrawElements(GL_TRIANGLES, objects[i].m_indexCount, GL_UNSIGNED_SHORT,
    				bodyOffset);
    
    		glDisableVertexAttribArray(attribColorShadow);
    		glDisableVertexAttribArray(attribPositionShadow);
    		glDisableVertexAttribArray(attribNormalShadow);
    		glDisableVertexAttribArray(attribTexCoordShadow);
    	}
    }
    

    The most interesting part of rendering the shadows is the shaders. Here is the technique.

    Vertex shader (draws shadows):

    attribute vec3 Position;
    attribute vec3 Normal;
    attribute vec4 SourceColor;
    attribute vec2 TexCoord;
    
    varying vec4 fColor;
    varying vec3 fNormal;
    varying vec2 fTexCoord;
    varying vec4 fShadowMapCoord;
    
    uniform mat4 Projection;
    uniform mat4 Modelview;
    uniform mat4 lightMatrix;
    uniform mat4 Transform;
    
    void main(void)
    {
    	fColor = SourceColor;
    	gl_Position = Projection * Modelview * Transform * vec4(Position, 1.0);
    	fShadowMapCoord = lightMatrix * Transform * vec4(Position, 1.0);
    	fNormal = normalize(Normal);
    	fTexCoord = TexCoord;
    }
    

    The vertex shader, in parallel with its usual work, transforms the vertices into the light source's space. In this example, the transformation into light space is given by the lightMatrix matrix, and the result is passed to the pixel shader through fShadowMapCoord.

    Pixel shader (draws shadows):

    uniform highp sampler2D shadowMapTex;
    
    varying lowp vec4 fColor;
    varying lowp vec3 fNormal;
    varying highp vec2 fTexCoord;
    varying highp vec4 fShadowMapCoord;
    
    highp vec3 Light = vec3(0.0, 4.0, 7.0);
    highp vec4 Color = vec4(0.2, 0.4, 0.5, 1.0);
    
    void main(void)
    {
    	const lowp float fAmbient = 0.4;
    	Light = normalize(Light);
    	highp float depth = (fShadowMapCoord.z / fShadowMapCoord.w);
    	highp float depth_light = texture2DProj(shadowMapTex, fShadowMapCoord).r;
    	highp float visibility = depth <= depth_light ? 1.0 : 0.2;
    	gl_FragColor = fColor * max(0.0, dot(fNormal, Light)) * visibility;
    }
    
    

    The pixel shader calculates each pixel's depth relative to the light source and compares it with the corresponding value in the depth map. If the value does not exceed the stored depth, the pixel is visible from the light's position; otherwise, it is in shadow. In this example we simply scale the color intensity by the visibility coefficient, but in general more sophisticated techniques are used.

    About the Authors

    Stanislav works in the Software & Service Group at Intel Corporation. He has 10+ years of experience in software development. His main interest is optimization of performance, power consumption, and parallel programming. In his current role as an Application Engineer providing technical support for Intel® processor-based devices, Stanislav works closely with software developers and SoC architects to help them achieve the best possible performance on Intel platforms. Stanislav holds a Master's degree in Mathematical Economics from the National Research University Higher School of Economics.

    Iliya, co-author of this blog, is also a Senior Software Engineer in the Software & Service Group at Intel. He is a developer on the Intel® VTune™ Amplifier team. He received a Master’s degree from the Nizhniy Novgorod State Technical University.

  • Shadow Mapping
  • rendering
  • Shader
  • vcsource_index
  • vcsource_type_techsample
  • vcsource_os_windows
  • vcsource_domain_graphics
  • vcsource_type_productsample
  • vcsource_type_techarticle
  • Developers
  • Android*
  • Game Development
  • Graphics
  • URL
  • Disabling Tracing of Android* Applications in Intel® GPA


    Intel® GPA 2013 R4 now supports tracing of Android* applications. This feature may cause some overhead during the analysis with Intel® GPA System Analyzer, which may impact the accuracy of some platform metrics displayed. If you do not intend to use trace analysis for Android* applications and would like to avoid the possible overhead, you have an option to disable this feature once Intel GPA is installed. 
    Note: On Ubuntu* and OS X* analysis systems, tracing of Android* applications is disabled by default.

    To disable tracing on a Windows* OS analysis system:

    1. Make sure that Intel GPA System Analyzer is not running.
    2. Locate the GpaSystemAnalyzer.cfg configuration file in the Intel GPA home directory. On Windows* OS, the default location is: C:\Users\<user_ID>\Documents\<GPA_version>\
      The GpaSystemAnalyzer.cfg file appears after the first launch of the Intel GPA System Analyzer. If you do not see this file, run the tool, close it, and try again.
    3. Open the configuration file in a text editor.
    4. In the Android* section of the file, replace ”install_tracing = true,” with ”install_tracing = false,”. Make sure to preserve the comma at the end of the line. 
      Tracing is disabled until you restore the original setting of the “install_tracing” variable.

    Next Steps

    For more details on getting started with Intel GPA on the Android* OS, please refer to the product's online help. The Intel GPA home page also contains links to product information, including information about analyzing DirectX* games on the Windows* OS platform.

    If you want to be notified of Intel GPA product updates, use this link.

    As always, we welcome your suggestions, so please let us know what we can do to improve your use of these tools by posting your comments on the Intel GPA Support Forum.

    *Other names and brands may be claimed as the property of others.

  • vcsource_type_techarticle
  • vcsource_product_gpa
  • vcsource_index
  • Developers
  • Intel® Graphics Performance Analyzers
  • Game Development
  • Optimization
  • URL
  • Game Hero - An advergame like you've never seen before.


    Text by Mitikazu Lisboa, CEO - Hive Digital Media.

    Hive Digital Media is the largest Brazilian game developer and a long-time Intel partner on several projects, but even so we were surprised when we were asked to create an advergame with perhaps the most ambitious mission of any project of this kind I had ever heard of: bringing together all the eras of gaming in a single advergame that would serve as a tribute to games and gamers of every generation, while also promoting the advantages of the Ultrabook.

    The game's development process was greatly accelerated by the strength of Hive's partnership with Intel and by the benefits of Intel's developer program, which allowed an agile flow of information as well as the use of Ultrabook-specific features, one of the first guidelines we defined together.

    The first task was, obviously, to create a central concept for the game, along with storytelling that made sense. After a few days of conversation between us, Intel, and DM9, the agency responsible for the Ultrabook media campaign, we arrived at the Game Hero concept, in which a character, the Player, is imprisoned by a classic-game villain and must defeat him to rebuild his Ultrabook and, on top of that, escape from his prison. We tried to keep a nostalgic, light mood and chose to include many references and "easter eggs" from classic games that more "old school" players will certainly recognize. Another strategic decision was to mix several game styles into one to evoke the trends of the eras we wanted to cover, so we divided the game into 4 different "worlds," each grouping characteristics of one of the "eras" of gaming, ranging from a typical 8-bit console platformer all the way to a modern FPS, accessible only to users playing on an Ultrabook.

    Naturally, these decisions had a big impact on the game design and development process, mainly because of the different gameplay styles that had to run through the structure of the game. This demanded a multidisciplinary team and some professionals who were only involved in specific parts of the game, where their skills were relevant to recreate what we wanted for that particular "era of games." Of Hive's more than 60 employees, roughly 15 were directly or indirectly involved in the project, making it one of our biggest cases of 2013.

    The main challenge on the development side was working on different platforms for a single concept. What sets Game Hero apart from this point of view is that the base of the game runs on one platform while the individual games were built with other technologies, and bridging the menu, the games, and the server information was an interesting challenge. In addition, the variety of mechanics and the implementation of mashups within a single concept made the game's architecture complex, setting it apart from traditional advergames.

    Several development tools and platforms were used for the Game Hero project. The game's main menu, the 8-bit game, and the 16-bit game use the Flixel framework, written in ActionScript 3. The 32-bit game uses the Unity3d engine. The game's menu transitions to the binaries developed in Flixel and Unity within the same canvas, while the server side handles the exchange of information. Some plugins, such as Tile Ed, Mesh Exploder, and Action RPG, were used in the 32/64 game to help with some of the required mechanisms. For the modern world we are also using Unity3d, with the difference that it is a standalone build that the user downloads and installs on their personal computer. The final result turned out very solid and offers a challenge worthy of any old school gamer willing to venture through the various eras of gaming.

    Watch the Game Hero trailer below and see why this is a project that spans all the Eras of Gaming.

     

     

    Interested? Want to know more? Then check out the links below:

    1. GAME HERO website
    2. Hive Digital Media website
    3. Application development for Intel devices
    4. How to join Intel's software program
    5. Tools for games and media

  • games
  • parter
  • unity
  • windows
  • inovação
  • Developers
  • Intel AppUp® Developers
  • Partners
  • Professors
  • Students
  • Microsoft Windows* (XP, Vista, 7)
  • Microsoft Windows* 8
  • Windows*
  • C/C++
  • Unity
  • Other Software Tools
  • Development Tools
  • Game Development
  • Small Business
  • User Experience and Design
  • Laptop
  • Tablet
  • Desktop
  • URL

  • Gameplay: Touch controls for your favorite games


    Download Article

    Download Gameplay: Touch controls for your favorite games [PDF 703KB]

    GestureWorks Gameplay is a revolutionary new way of interacting with popular PC games. Gameplay software for Windows 8 lets gamers use and build their own Virtual Controllers for touch, which are overlaid on top of existing PC games. Each Virtual Controller overlay adds buttons, gestures, and other controls that are mapped to input the game already understands. In addition, gamers can use hundreds of personalized gestures to interact on the screen. Ideum’s collaboration with Intel gave them access to technology and engineering resources to make the touch overlay in Gameplay possible.

    Check out this one-minute video that explains the Gameplay concept.

    It’s all about the virtual controllers

    Unlike traditional game controllers, virtual controllers can be fully customized, and gamers can even share them with their friends. Gameplay works on Windows 8 tablets, Ultrabooks, 2-in-1 laptops, All-In-Ones, and even multitouch tables and large touch screens.


    Figure 1 - Gameplay in action on Intel Atom-based tablet

    "The Virtual Controller is real! Gameplay extends hundreds of PC games that are not touch-enabled and it makes it possible to play them on a whole new generation of portable devices, " says Jim Spadaccini, CEO of Ideum, makers of GestureWorks Gameplay. "Better than a physical controller, Gameplay’s Virtual Controllers are customizable and editable. We can’t wait to see what gamers make with Gameplay."


    Figure 2 - The Home Screen in Gameplay

    Several dozen pre-built virtual controllers for popular Windows games come with GestureWorks Gameplay (currently there are over 116 unique titles). Gameplay lets users configure, layout, and customize existing controllers as well. The software also includes an easy to use, drag-and-drop authoring tool allowing users to build their own virtual controller for many popular Windows-based games distributed on the Steam service.


    Figure 3 - Virtual Controller layout view

    Users can place joysticks, D-pads, switches, scroll wheels, and buttons anywhere on the screen, change the size, opacity, and add colors and labels. Users can also create multiple layout views which can be switched in game at any time. This allows a user to create unique views for different activities in game, such as combat versus inventory management functions in a Role Playing Game.


    Figure 4 - Virtual Controller Global Gestures View

    Powered by the GestureWorks gesture-processing engine (GestureWorks Core), Gameplay provides support for over 200 global gestures. Basic global gestures such as tap, drag, pinch/zoom, and rotate are supported by default, but are also customizable. This allows extension of overlaid touch controllers, giving gamers access to multi-touch gestures that can provide additional controls to PC games. For example, certain combat moves can be activated with a simple gesture instead of a button press in an FPS. Gameplay even includes experimental support for accelerometers so you can steer in a racing game by tilting your Ultrabook™ or tablet, and it detects when you change your 2-in-1 device to tablet mode to optionally turn on the virtual controller overlay.

    Challenges Addressed During Development

    Developing all this coolness was not easy. To make the vision for Gameplay a reality, several technical challenges had to be overcome. Some of these were solved using traditional programming methods, while others required more innovative solutions.

    DLL injection

    DLL injection is a method used for executing code within the address space of another process by getting it to load an external dynamically-linked library. While DLL injection is often used by external programs for nefarious reasons, there are many legitimate uses for it, including extending the behavior of a program in a way its authors did not anticipate or originally intend. With Gameplay, we needed a method to insert data into the input thread of the process (game) being played so the touch input could be translated to inputs the game understood. Of the myriad methods for implementing DLL injection, Ideum chose to use the Windows hooking calls in the SetWindowsHookEx API. Ultimately, Ideum opted to use process-specific hooking versus global hooking for performance reasons.

    Launching games from a third-party launcher

    Two methods of hooking into a target process's address space were explored. The application can hook into a running process's address space, or the application can launch the target executable as a child process. Both methods are sound; however, in practice, it is much easier to monitor and intercept processes or threads created by the target process when the application is a parent of the target process.

    This poses a problem for application clients, such as Steam and UPlay, that are launched when a user logs in. Windows provides no guaranteed ordering for startup processes, and the Gameplay process must launch before these processes to properly hook in the overlay controls. Gameplay solves this issue by installing a lightweight system service that monitors for startup applications when a user logs in. When one of the client applications of interest starts, Gameplay is then able to hook in as a parent of the process, ensuring the overlay controls are displayed as intended.

    Lessons Learned

    Mouse filtering

    During development, several game titles were discovered that incorrectly processed virtual mouse input received from the touch screen. This problem largely manifested with First Person Shooter titles or Role Playing Titles that have a "mouse-look" feature. The issue was that the mouse input received from the touch panel was absolute with respect to a point on the display, and thus in the game environment. This made the touch screen almost useless as a "mouse-look" device. The eventual fix was to filter out the mouse inputs by intercepting the input thread for the game. This allowed Gameplay to emulate mouse input via an on-screen control such as a joystick for the "mouse-look" function. It took a while to tune the joystick responsiveness and dead zone to feel like a mouse, but once that was done, everything worked beautifully. You can see this fix in action on games like Fallout: New Vegas or The Elder Scrolls: Skyrim.

    Vetting titles for touch gaming

    Ideum spent a significant amount of time tuning the virtual controllers for optimal gameplay. Several elements of a game determine its suitability for use with Gameplay. Below are some general guidelines that were developed for what types of games work well with Gameplay:

    Gameplay playability by game type (rated from Good to Better to Best):

    • Role Playing Games (RPG)
    • Simulation
    • Fighting
    • Sports
    • Racing
    • Puzzles
    • Real Time Strategy (RTS)
    • Third Person Shooters
    • Platformers
    • Side Scrollers
    • Action and Adventure

    While playability is certainly an important aspect of vetting a title for use with Gameplay, the most important criterion is stability. Some titles will simply not work with the hooking technique, input injection, or overlay technology. This can happen for a variety of reasons, but most commonly it is because the game title itself monitors its own memory space or input thread to check for tampering. While Gameplay itself is a completely legitimate application, it employs techniques that can also be used for the forces of evil, so unfortunately some titles that are sensitive to these techniques will never work unless touch is enabled natively.

    User Response

    While still early in its release, Gameplay 1.0 has generated some interesting user feedback regarding touch gaming on a PC. There are already some clear trends in the feedback being received. At a high level, it is clear that everyone universally loves being able to customize the touch interface for games. The remaining feedback focuses on personalizing the gaming experience in a few key areas:

    • Many virtual controllers are not ideal for left-handed people; this was an early change to many of the published virtual controllers.
    • Button size and position are the most common changes, so much so that Ideum is considering adding an automatic hand-sizing calibration in a future Gameplay release.
    • Many users prefer rolling touch inputs vs. discrete touch and release interaction.

    We expect many more insights to reveal themselves as the number of user created virtual controllers increases.

    Conclusion

    GestureWorks Gameplay brings touch controls to your favorite games. It does this via a visual overlay and supports additional interactions like gestures, accelerometers, and 2-in-1 transitions. What has been most interesting in working on this project has been the user response. People are genuinely excited about touch gaming on PCs, and ecstatic that they can now play many of the titles they previously enjoyed with touch.

    About Erik

    Erik Niemeyer is a Software Engineer in the Software & Solutions Group at Intel Corporation. Erik has been working on performance optimization of applications running on Intel microprocessors for nearly fifteen years. Erik specializes in new UI development and micro-architectural tuning. When Erik is not working he can probably be found on top of a mountain somewhere. Erik can be reached at erik.a.niemeyer@intel.com.

    About Chris

    Chris Kirkpatrick is a software applications engineer working in the Intel Software and Services Group supporting Intel graphics solutions on mobile platforms in the Visual & Interactive Computing Engineering team. He holds a B.Sc. in Computer Science from Oregon State University. Chris can be reached at chris.kirkpatrick@intel.com.

    Resources

    https://gameplay.gestureworks.com/

    http://software.intel.com/en-us/articles/detecting-slateclamshell-mode-screen-orientation-in-convertible-pc

     

    Intel, the Intel logo, and Ultrabook are trademarks of Intel Corporation in the U.S. and/or other countries.

    Copyright © 2014 Intel Corporation. All rights reserved.

    *Other names and brands may be claimed as the property of others.

  • ideum
  • GestureWorks; Ultrabook
  • virtual controller
  • Developers
  • Microsoft Windows* 8
  • Windows*
  • Beginner
  • Game Development
  • Sensors
  • Touch Interfaces
  • User Experience and Design
  • Laptop
  • Tablet
  • URL
  • Creating cross-platform games with Cocos2d-x


    In this tutorial, we will show how to create a simple game using the Cocos2d-x framework in a Windows* development environment and how to build the game for Windows and Android.

    What is Cocos2d-x?


    Cocos2d-x is a cross-platform framework for developing games (and other graphical apps, such as interactive books) based on cocos2d for iOS, but using C++, JavaScript, or Lua instead of Objective-C.

    One of the advantages of this framework is that it lets you develop games for several platforms (Android*, iOS*, Win32*, Windows* Phone, Windows* 8, Mac*, Linux*, etc.) while maintaining a single code base (with some adaptations from one platform to another).

    It is open source under the MIT License, and its source code can be found here.

    For more information about Cocos2d-x and its documentation, visit: http://www.cocos2d-x.org/

    Creating your first game


    1-  Download the latest version of the framework from the site and unpack it in your development environment. This tutorial uses version 2.2.2, unpacked on the desktop (C:\Users\felipe.pedroso\Desktop\cocos2d-x-2.2)

    2- To create a new Cocos2d-x project, we will use a Python script (create_project.py) that creates the whole project structure inside the folder where the framework was unpacked. If you do not have the Python runtime installed, download version 2.7.6 from the following link: http://www.python.org/download/.

    3-  Open the command prompt (cmd.exe) and run the following commands:

    -   Navigate to the script folder (it is important that create_project.py be run from inside the project-creator folder)
        
        cd C:\Users\felipe.pedroso\Desktop\cocos2d-x-2.2\tools\project-creator
        
    -   Run the script with the following command:
        
        python create_project.py -project MyFirstGame -package com.example.myfirstgame -language cpp
        

    Explaining the parameters:
        project: the name of your project/game
        package: your application's package name (e.g., br.suaempresa.MyFirstGame)
        language: the programming language to use (cpp, lua, or javascript)

    Note: to run the 'python' command from the command line, add the folder where Python was installed to the Path environment variable.

    -   If everything works correctly, your project will be created in the projects folder, inside the directory where the framework was unpacked.

    The generated project contains the game's base code (Classes), the resources (images, audio, etc.), and projects for each of the platforms supported by the framework.

    Building for Win32 (Windows* 7 or Windows* 8 desktop mode)


    Requirements:

    Inside the project directory, open the MyFirstGame.sln file in the proj.win32 folder with Visual Studio.

    Build the project by pressing F6 (or menu Build -> Build Solution) and run it by pressing F5 (or menu Debug -> Start Debugging).

    If everything goes well, you will see the following window:

    Building for Windows* 8 (Windows* Store App)


    Requirements:

    To build the project as a Windows* Store App, open the MyFirstGame.sln file from the proj.winrt folder and build it in the same way as the Win32 project.

    After building and running, you will see the following screen:

    Note: the cocos2d-x version used in this tutorial did not work on Windows* 8.1.

    Building for Android*


    Requirements

    Just as Python was added to the Windows* Path, add the Android SDK tools and platform-tools directories, the NDK root directory, and the Apache Ant bin directory so you can use the commands needed to build the app.

    Open a new command prompt (cmd.exe) and run the following commands to set the environment variables required to build the Android app:

        set COCOS2DX_ROOT=C:\Users\felipe.pedroso\Desktop\cocos2d-x-2.2
        set NDK_TOOLCHAIN_VERSION=4.8
        set NDK_MODULE_PATH=%COCOS2DX_ROOT%;%COCOS2DX_ROOT%\cocos2dx\platform\third_party\android\prebuilt

        
    Explaining the variables:

        COCOS2DX_ROOT: the directory where cocos2d-x was extracted
        NDK_TOOLCHAIN_VERSION: the NDK toolchain version that will be used to build the project
        NDK_MODULE_PATH: modules that should be included in the NDK build. In this example, we are using the precompiled cocos2d-x modules

    With the environment variables set, navigate to the project directory:

        cd C:\Users\felipe.pedroso\Desktop\cocos2d-x-2.2\projects\MyFirstGame\proj.android

    Copy the game's resources (images, sounds, etc.) to the assets folder:

        rmdir /S /Q assets
        mkdir assets
        xcopy /E ..\Resources .\assets

    Run the following command to build the native modules:

        ndk-build.cmd -C . APP_ABI="armeabi armeabi-v7a x86"

    The command will generate the native libraries for three different architectures: ARM, ARM-NEON, and x86. This allows your game to run on several architectures, taking advantage of the best each one has to offer. After the native modules finish building, build the Android app with the ant command:

        ant debug


        
    Now, to install the application on a device or emulator, use the command:

        adb install -r bin\MyFirstGame.apk

    Then just open the app:

    Done! Your game can now run on at least three platforms: Android* (on three architectures), Windows* 7, and Windows* 8!

    See you next time!

    * Other names and brands may be claimed as the property of others.

  • cocos2d-x
  • mobile game development
  • Developers
  • Partners
  • Professors
  • Students
  • Android*
  • Microsoft Windows* (XP, Vista, 7)
  • Microsoft Windows* 8
  • Android*
  • Windows*
  • C/C++
  • Beginner
  • Game Development
  • Laptop
  • Phone
  • Tablet
  • Desktop
  • URL
  • Getting started
  • How To Plan Optimizations with Unity*


    By John Wesolowski

    Downloads

    How To Plan Optimizations with Unity* [PDF 2.15MB]


    Abstract

    Unity provides a number of tools and settings to help make games perform smoothly. For this project, we chose ones we thought could prove to be troublesome and analyzed how they affected game performance on Intel® graphics processors.

    We put ourselves in the shoes of a game developer learning how to use Unity. We wanted to stumble into performance pitfalls and then determine how to work through issues with Unity’s built-in performance mechanisms. One of Unity’s strengths is the ability to create content quickly, but when considering performance, especially on mobile and tablet devices, the developer needs to slow down and plan out how to utilize the built-in performance mechanisms. This paper gives new and existing Unity users performance considerations to keep in mind when building levels and games, and offers new ways to build.


    Introduction

    Creating games within Unity is fairly simple. Unity offers a store where you can purchase items like meshes, pre-written scripts, game demos, or even full games. For the purposes of my testing, I was concerned with manipulating an existing game to find areas where performance gains could or could not be achieved. I dove into the Unity Tech Demo called Boot Camp, free for download in the assets store, to see what kind of trouble I could get into.

    I used Unity 3.0 to create the game settings and run all of the scenes. The testing was performed on a 3rd generation Intel® Core™ processor-based computer with Intel® HD Graphics 4000. The test results are not applicable to mobile devices.

    Quality Manager

    Unity has additional render settings for games, found in the Edit->Project Settings->Quality menu (Figure 1). These are customizable render settings that can be modified for individual needs. Unity has helpful online documentation explaining what the Quality Settings are and how to modify them through Unity’s scripting API.
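
    As a minimal sketch of that scripting route (the class name and the decision to switch to level 0 are my own, not from Unity's documentation), the quality levels defined in the Quality Settings can be listed and switched at runtime like this:

    using UnityEngine;

    // Illustrative helper: enumerates the quality levels and switches to the lowest one.
    public class QualityTweaker : MonoBehaviour
    {
        void Start()
        {
            // Names of the levels defined under Edit->Project Settings->Quality.
            foreach (string levelName in QualitySettings.names)
                Debug.Log("Quality level available: " + levelName);

            // Switch to level 0 ("Fastest" by default); the second argument controls
            // whether expensive changes such as anti-aliasing are applied immediately.
            QualitySettings.SetQualityLevel(0, true);
        }
    }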


    Figure 1. The Tags and Layers available through the Edit->Project Settings->Tag inspector

    As for my task to find optimizations with Unity, I decided to mess around with some of the Quality Settings to see what kind of gains or losses I could find, although I did not test all of the different options available.

    Texture Quality

    The Quality Settings Inspector has a drop down menu where you select render resolutions for your textures. You can choose from 1/8, ¼, ½, or Full Resolution. To see the performance gains/losses between different texture resolutions, I took frame rate captures of a sample scene, testing all of Unity’s default Quality Settings (Fastest, Fast, Good, etc.), while adjusting only the Texture Quality between each capture. Figures 2 and 3 show a comparison between a scene with 1/8 Texture Resolution and Full Resolution.


    Figure 2. Unity* Scene Boot Camp running at 1/8 resolution


    Figure 3. Unity* Scene Boot Camp running at full resolution

    We took a frames per second (FPS) capture using Intel® Graphics Performance Analyzers (Intel® GPA) after changing the texture resolution. Looking at the Fantastic setting (Table 1), you can see the performance did not change much by varying the texture sizes.


    Table 1. Illustrates the change in FPS while switching between Unity’s* provided texture qualities

    Although an Intel® graphics-based PC’s performance is not affected by texture size changes, there are other things to consider, like the total amount of memory on the device and its usage by the application.
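
    For texture quality specifically, the same drop-down is exposed to scripts as QualitySettings.masterTextureLimit (0 is full resolution; each higher value halves it). A small hedged example, with the key binding chosen only for illustration:

    using UnityEngine;

    // Illustrative only: cycles the texture resolution limit at runtime.
    public class TextureQualityToggle : MonoBehaviour
    {
        void Update()
        {
            // 0 = full resolution, 1 = half, 2 = quarter, 3 = eighth.
            if (Input.GetKeyDown(KeyCode.T))
                QualitySettings.masterTextureLimit =
                    (QualitySettings.masterTextureLimit + 1) % 4;
        }
    }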

    Shadow Distance

     

    Shadow distance is a setting that changes the culling distance of the camera being used for the shadows of game objects. Game objects within the shadow distance’s value from the camera have their shadows sent for rendering, whereas objects that are not within the shadow distance value do not have their shadows drawn.

    Depending on the settings used, shadows can adversely affect performance due to the amount of processing they require. To test the impact of Shadow Distance:

    • Set up a sample scene
    • Set scene to a Unity default quality setting
    • Adjust the shadow distance incrementally and take FPS captures using Intel GPA
    • Select different Unity default quality settings and repeat shadow distance captures

    This test did not use the Fastest and Fast Quality Levels because those default to turning shadows off.


    Figure 4. This is a setting found under the Inspector menu of Edit->Project Settings->Quality


    Figure 5. Unity* Tech Demo Boot Camp


    Table 2. FPS results from changing the Shadow Distance of Unity* Tech Demo, Boot Camp

    Shadows significantly impact performance. The data shows the FPS dropped by almost half when going from a distance of 0 to 50 in Simple mode. It is important to consider whether game objects can actually be seen and to make sure you are not drawing shadows unnecessarily. The shadow distance and other shadow settings can be controlled during gameplay via Unity scripting and can accommodate numerous situations. Although we only tested the effects of shadow distance, we expect similar performance deltas when changing the other settings under Shadow in the Quality settings.
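
    A minimal sketch of that runtime control; the distances and the indoor/outdoor trigger are assumptions of mine rather than values from the test scene:

    using UnityEngine;

    // Sketch: scale the shadow distance with gameplay state instead of leaving it fixed.
    public class ShadowDistanceController : MonoBehaviour
    {
        public float indoorDistance = 20f;    // illustrative values
        public float outdoorDistance = 60f;

        public void SetIndoor(bool indoor)
        {
            QualitySettings.shadowDistance = indoor ? indoorDistance : outdoorDistance;
        }
    }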

    Layers

    All game objects inside Unity are assigned to a layer upon creation. They are initially assigned to the default layer, as shown in Figure 6, but you can create your own unique layers. There are two ways to do this. You can simply click on the box next to Layer and select Add New Layer. You can also go to Edit->Project Settings->Tags.


    Figure 6. The Layer menu found inside the Inspector of a game object


    Figure 7. The Tag Manager via the inspector menu

    From the inspector window (Figure 7) you can create a new layer and specify which layer number you want it to belong to. Both methods lead you to the same Tag Manager window. Once a layer is created, game objects can be assigned to them by choosing the desired layer under the options box next to Layer under that game object’s inspector window. This way, you can group objects in common layers for later use and manipulation. Keep in mind what layers are and how to create and modify them for when I talk about a few other layer features later in the paper.
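
    Layers can also be assigned from code. A small sketch, where "DistantProps" is a hypothetical layer assumed to have been created under Edit->Project Settings->Tags:

    using UnityEngine;

    // Sketch: assign this game object to a named layer at startup.
    public class LayerAssigner : MonoBehaviour
    {
        void Start()
        {
            int layerIndex = LayerMask.NameToLayer("DistantProps");
            if (layerIndex >= 0)
                gameObject.layer = layerIndex;
            else
                Debug.LogWarning("Layer 'DistantProps' has not been created yet.");
        }
    }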

    Layer Cull Distances

    Your camera will not render game objects beyond the camera’s clipping plane in Unity. There is a way, through Unity scripting, to set certain layers to a shorter clipping distance.


    Figure 8. Sample script taken from Unity*’s Online Documentation showing how to modify a layer’s culling distance

    It takes a bit of work to set up game objects so they have a shorter culling distance. First, place the objects onto a layer. Then, write a script to modify an individual layer’s culling distance and attach it to the camera. The sample script in Figure 8 shows how a float array of 32 is created to correspond to the 32 possible layers available for creation under the Edit->Project Settings->Tags. Modifying a value for an index in your array and assigning it to camera.layerCullDistances will change the culling distance for the corresponding layer. If you do not assign a number for an index, the corresponding layer will use the camera’s far clip plane.
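
    The sketch below re-creates that idea in C#; it is in the spirit of the Unity sample referenced in Figure 8, not a copy of it, and layer index 10 and the 50-unit distance are assumptions for illustration:

    using UnityEngine;

    // Attach to the camera: shortens the culling distance for one layer only.
    public class LayerCulling : MonoBehaviour
    {
        void Start()
        {
            Camera cam = GetComponent<Camera>();
            float[] distances = new float[32];   // one entry per possible layer
            distances[10] = 50f;                 // cull layer 10 at 50 units
            // Layers left at 0 fall back to the camera's far clip plane.
            cam.layerCullDistances = distances;
        }
    }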

    To test performance gains from layerCullDistances, I set up three scenes filled with small, medium, and large objects in terms of complexity. The scenes were arranged with a number of identical game objects grouped together and placed incrementally further and further away from the camera. I used Intel GPA to take FPS captures while incrementing the layer culling distance each time, adding another group of objects to the capture, i.e., the first capture had one group of objects, whereas the sixth capture had six groups of objects.

    Figures 9, 10, and 11 show the scenes I used for testing with the different types of objects.

    Boots: Poly – 278 Vertices – 218

    Figure 9. Test scene filled with low polygon and vertices count boot objects

    T-Rex’s: Poly – 4398 Vertices – 4400

    Figure 10. Test scene filled with medium polygon and vertices count dinosaur objects

    Airplane: Poly - 112,074 Vertices - 65,946

    Figure 11. Test scene filled with large polygon and vertices count airplane objects

    Tables 3, 4, and 5 show the change in FPS for each of the scenes tested.


    Table 3. Data collected from the scene with boots (Figure 9)


    Table 4. Data collected from the scene with dinosaurs (Figure 10)


    Table 5. Data collected from the scene with airplanes (Figure 11)


    Table 6. Fantastic mode data from all the test scenes

    This data shows that performance gains can be achieved from using the layerCullDistances feature within Unity.

    Table 6 illustrates how having more objects on the screen impacts performance, especially with complex objects. As a game developer, using the layerCullDistances proves to be very beneficial for performance if utilized properly. For example, smaller objects with a complex mesh that are farther away from the camera can be set up to only draw when the camera is close enough for the objects to be distinguished. While planning and designing a level, the developer needs to consider things like mesh complexity and the visibility of objects at a greater distance from the camera. By planning ahead, you can achieve greater benefits from using layerCullDistances.

    Camera

    I explored Unity’s camera, focusing on its settings and features. I toyed with some of the options under its GUI and examined other features and addons.


    Figure 12. The Inspector menu that appears while having a camera selected

    When creating a new scene, by default, there is only one camera game object labeled Main Camera. To create or add another camera, first create an empty game object by going to: Game Object->Create Empty. Then select the newly created empty object and add the camera component: Components->Rendering->Camera.
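
    The same thing can be done from a script instead of the menus; a minimal sketch (the object name and depth value are arbitrary):

    using UnityEngine;

    // Sketch: create an additional camera at runtime.
    public class CameraSpawner : MonoBehaviour
    {
        void Start()
        {
            GameObject go = new GameObject("Second Camera");
            Camera cam = go.AddComponent<Camera>();
            cam.depth = 1f;   // cameras with a higher depth draw on top of lower ones
        }
    }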

    Unity’s camera comes with a host of functionality inside its GUI, as shown in Figure 12. The features I chose to explore were: Rendering Path and HDR.

    Render Path

    The Render Path tells Unity how to handle light and shadow rendering in the game. Unity offers three render types, listed from highest cost to least: Deferred (Pro only), Forward, and Vertex Lit rendering. Each renderer handles light and shadow a little differently, and they require different amounts of CPU and GPU processing power. It’s important to understand the platform and hardware you want to develop for so you can choose a renderer and build your scene or game accordingly. If you pick a renderer that is not supported by the graphics hardware, Unity will automatically fall back to a lower-fidelity rendering path.


    Figure 13. Player Settings Inspector window

    The Rendering Path can be set in two different ways. The first is under Edit->Project Settings->Player (Figure 13). You will find the Rendering Path drop-down box under the Other Settings tab. The second is from the Camera Inspector GUI (Figure 14). Choosing something other than ‘Use Player Settings’ will override the rendering path set in your player settings, but only for that camera. So it is possible to have multiple cameras using different rendering buffers to draw the lights and shadows.


    Figure 14. The drop down box from selecting the Rendering Path under the Camera GUI

    Developers should know that these different light rendering paths are included in Unity and how each handles rendering. The reference section at the end of this document has links to Unity’s online documentation. Make sure you know your target audience and what type of platform they expect their game to be played on. This knowledge will help you select a rendering path appropriate to the platform. For example, a game designed with numerous light sources and image effects that uses deferred rendering could prove to be unplayable on a computer with a lower-end graphics card. If the target audience is a casual gamer, who may not possess a graphics card with superior processing power, this could also be a problem. It is up to developers to know the target platform on which they expect their game to be played and to choose the lights and rendering path accordingly.
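
    Per-camera overrides can also be applied from a script. A sketch, where the shader-level threshold is simply an illustrative way to pick a cheaper path on weaker hardware:

    using UnityEngine;

    // Sketch: override the rendering path on this camera based on hardware capability.
    public class RenderPathSelector : MonoBehaviour
    {
        void Start()
        {
            Camera cam = GetComponent<Camera>();
            bool capableGpu = SystemInfo.graphicsShaderLevel >= 30;   // illustrative threshold
            cam.renderingPath = capableGpu
                ? RenderingPath.UsePlayerSettings
                : RenderingPath.Forward;    // cheaper fallback
        }
    }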

    HDR (High Dynamic Range)

    In normal rendering, each pixel’s red, blue, and green values are represented by a decimal number between 0 and 1. By limiting your range of values for the R, G, and B colors, lighting will not look realistic. To achieve a more naturalistic lighting effect, Unity has an option called HDR, which when activated, allows the number values representing the R, G, and B of a pixel to exceed their normal range. HDR creates an image buffer that supports values outside the range of 0 to 1, and performs post-processing image effects, like bloom and flares. After completing the post-processing effects, the R, G, and B values in the newly created image buffer are reset to values within the range of 0 to 1 by the Unity Image Effect Tonemapping. If Tonemapping is not executed when HDR is included, the pixels could be out of the normal accepted range and cause some of the colors in your scene to look wrong in comparison to others.

    Pay attention to a few performance issues when using HDR. If using Forward rendering for a scene, HDR will only be active if image effects are present. Otherwise, turning HDR on will have no effect. Deferred rendering supports HDR regardless.

    If a scene is using Deferred rendering and has Image Effects attached to a camera, HDR should be activated. Figure 15 compares the draw calls for a scene with image effects and deferred rendering while HDR is turned on and off. With HDR off and image effects included, you see a larger number of draw calls than if you include image effects with HDR turned on. In Figure 15, the number of draw calls is represented by the individual blue bars, and the height of each blue bar reveals the amount of GPU time each draw call took.


    Figure 15. The capture from Intel® Graphics Performance Analyzers with HDR OFF shows over 2000 draw calls, whereas the capture with HDR ON has a little over 900 draw calls.

    Read over Unity’s HDR documentation and understand how it affects game performance. You should also know when it makes sense to use HDR to ensure you are receiving its full benefits.
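
    HDR is a checkbox on the camera, and it can also be toggled from a script. A hedged sketch: the property is named allowHDR in current Unity versions and was simply hdr in the Unity 3.x/4.x era this article targets:

    using UnityEngine;

    // Sketch: enable HDR on this camera from code.
    public class HdrToggle : MonoBehaviour
    {
        void Start()
        {
            Camera cam = GetComponent<Camera>();
            cam.allowHDR = true;   // called "hdr" in older Unity versions; only has a
                                   // visible effect with deferred rendering or image effects
        }
    }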

    Image Effects

    Unity Pro comes with a range of image effects that enhance the look of a scene. Add Image Effects assets, even after creating your project, by going to Assets->Import Package->Image Effects. Once imported, there are two ways to add an effect to the camera. Click on your camera game object, then within the camera GUI, select Add Component, then Image Effects. You can also click on your camera object from the menu system by going to Component->Image Effect.

    SSAO – Screen Space Ambient Occlusion

     

    Screen space ambient occlusion (SSAO) is an image effect included in Unity Pro’s Image Effect package. Figure 16 shows the difference between a scene with SSAO off and on. The images look similar, but performance is markedly different. The scene without SSAO ran at 32 FPS and the scene with SSAO ran at 24 FPS, a 25% decrease.


    Figure 16. A same level comparison with SSAO off (top) vs. SSAO on (bottom)

    Be careful when adding image effects because they can negatively affect performance. For this document we only tested the SSAO image effect but expect to see similar results with the other image effects.

    Occlusion Culling

    Occlusion Culling disables object rendering not only outside of the camera’s clipping plane, but for objects hidden behind other objects as well. This is very beneficial for performance because it cuts back on the amount of information the computer needs to process, but setting up occlusion culling is not straightforward. Before you set up a scene for occlusion culling, you need to understand the terminology.

        Occluder – An object marked as an occluder acts as a barrier that prevents objects marked as occludees from being rendered.

        Occludee – Marking a game object as an occludee will tell Unity not to render the game object if blocked by an occluder.

    For example, all of the objects inside a house could be tagged as occludees and the house could be tagged as an occluder. If a player stands outside of that house, all the objects inside marked as occludees will not be rendered. This saves CPU and GPU processing time.

    Unity documents Occlusion Culling and its setup. You can find the link for setup information in the references section.

    To show the performance gains from using Occlusion Culling, I set up a scene that had a single wall with highly complex meshed objects hidden behind it. I took FPS captures of the scene while using Occlusion Culling and then without it. Figure 17 shows the scene with the different frame rates.


    Figure 17. The image on the left has no Occlusion Culling, so the scene takes extra time to render all the objects behind the wall, resulting in an FPS of 31. The image on the right takes advantage of Occlusion Culling, so the objects hidden behind the wall are not rendered, resulting in an FPS of 126.

    Occlusion culling requires developers to do a lot of manual setup. They also need to consider occlusion culling during game design so as to make the game’s configuration easier and the performance gains greater.

    Level of Detail (LOD)

    Level of Detail (LOD) allows multiple meshes to attach to a game object and provides the ability to switch between meshes the object uses based on camera distance. This can be beneficial for complex game objects that are really far away from the camera. The LOD can automatically simplify the mesh to compensate. To see how to use and setup LOD, check out Unity’s online documentation. The link to it is in the reference section.
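
    As a rough sketch of what that setup looks like from code (the editor workflow described in Unity's documentation is the usual route), assuming a Unity version that ships the LODGroup component and three renderers wired up per building; the transition heights are illustrative:

    using UnityEngine;

    // Sketch: build an LODGroup with three mesh levels for one building.
    public class BuildingLodSetup : MonoBehaviour
    {
        public Renderer lod0;   // most complex mesh
        public Renderer lod1;   // medium mesh
        public Renderer lod2;   // simplest mesh

        void Start()
        {
            LODGroup group = gameObject.AddComponent<LODGroup>();
            LOD[] lods = new LOD[]
            {
                // Screen-relative heights below which each level is swapped out.
                new LOD(0.60f, new Renderer[] { lod0 }),
                new LOD(0.30f, new Renderer[] { lod1 }),
                new LOD(0.10f, new Renderer[] { lod2 }),
            };
            group.SetLODs(lods);
            group.RecalculateBounds();
        }
    }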

    To test the performance gains from LOD, I built a scene with a cluster of houses, each with three different meshes attached. While standing in the same place, I took an FPS capture of the houses when the most complex mesh was attached. I then modified the LOD distance so the next lesser mesh appeared, and took another FPS capture. I did this for the three mesh levels and recorded my findings as shown in Table 7.

    Figures 18, 19, and 20 show the three varying levels of mesh complexity as well as the number of polygons and vertices associated with each mesh.

     Best Quality – LOD 0
    Building A
    • Vert – 7065
    • Poly – 4999
    Building B
    • Vert - 5530
    • Poly – 3694


    Figure 18. LOD level 0. This is the highest LOD level, set with the most complex building meshes

     Medium Quality – LOD 1
    Building A
    • Vert – 6797
    • Poly – 4503
    Building B
    • Vert – 5476
    • Poly – 3690


    Figure 19. LOD level 1. The next step on the LOD scale; this level was set with the medium-complexity meshes

       Low Quality – LOD 2
    Building A
    • Vert – 474
    • Poly – 308
    Building B
    • Vert – 450
    • Poly – 320


    Figure 20. LOD level 2. This was the last LOD level used and contained the least complex meshes for the buildings

    As I switched between the different LOD models, I took FPS captures for comparison (Table 7).


    Table 7. LOD FPS comparison when switching between lower-detail meshes

    Table 7 shows the performance gains from setting up and using LOD. The FPS captures show significant gains when using lower quality meshes. This, however, can mean a lot of extra work for the 3D artists, who must produce multiple models. It is up to the game designer to decide whether spending the extra time on more models is worth the performance gains.

    Batching

    Having numerous draw calls can cause overhead on the CPU and slow performance. The more objects on the screen, the more draw calls must be made. Unity has a feature called Batching that combines game objects into a single draw call. Static Batching affects static objects, and Dynamic Batching is for those that move. Dynamic Batching happens automatically if all requirements are met (see the batching documentation), whereas Static Batching must be set up manually.

    There are some requirements for getting the objects to draw together for both Dynamic and Static Batching, all of which are covered in Unity’s Batching document listed in the references section.
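
    Ticking the Static checkbox in the Inspector is the usual way to enable Static Batching, but Unity also provides StaticBatchingUtility for combining objects at runtime. The sketch below is a minimal example; airplaneRoot is a placeholder for a parent object whose children never move and share materials, so they qualify for batching.

    using UnityEngine;

    // Minimal sketch: combine all eligible child meshes under a root object
    // into static batches at load time. Batched objects can no longer move.
    public class BatchAirplanes : MonoBehaviour
    {
        public GameObject airplaneRoot; // placeholder: parent of the static meshes

        void Start()
        {
            StaticBatchingUtility.Combine(airplaneRoot);
        }
    }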

    To test the performance gains of Static Batching, I set up a scene with complex airplane game objects (Figure 21) and took FPS captures of the airplanes both with batching and without batching (Table 8).


    Figure 21. Static Batching Test scene filled with very complex airplane meshes


    Table 8. The difference in FPS and draw calls with static batching turned on and off for the test scene (Figure 21)

    Unity’s batching mechanism comes in two forms, Dynamic and Static. To fully see the benefits from batching, plan to have as many objects as possible batched together for single draw calls. Refer to Unity’s batching documentation and know what qualifies an object for dynamic or static batching.

    Conclusion

    While Unity proves fairly simple to pick up and develop with, it is also easy to get into performance trouble. Unity provides a number of tools and settings to help make games perform smoothly, but not all of them are as intuitive and easy to set up as others. Likewise, Unity has some settings that, when turned on or used inappropriately, can negatively affect game performance. An important part of developing with Unity is to have a plan before starting, because some of the performance features require manual setup and can be much more challenging to implement if not planned at the project's creation.

    References

    Quality Settings Documentation:
    http://docs.unity3d.com/Documentation/Components/class-QualitySettings.html

    Quality Settings Scripting API:
    http://docs.unity3d.com/Documentation/ScriptReference/QualitySettings.html

    Tech Demo Bootcamp:
    http://u3d.as/content/unity-technologies/bootcamp/28W

    Level of Detail Documentation:
    http://docs.unity3d.com/Documentation/Manual/LevelOfDetail.html

    Occlusion Culling Documentation:
    http://docs.unity3d.com/Documentation/Manual/OcclusionCulling.html

    Batching Documentation:
    http://docs.unity3d.com/Documentation/Manual/DrawCallBatching.html

    Rendering Path Documentation:
    http://docs.unity3d.com/Documentation/Manual/RenderingPaths.html

    Intel GPA:
    http://software.intel.com/en-us/vcsource/tools/intel-gpa

    Other Related Content and Resources

    Unity MultiTouch Source (finally)
    http://software.intel.com/en-us/blogs/2013/05/01/the-unity-multi-touch-source-finally

    Implementing Multiple Touch Gestures Using Unity3D With Touchscript
    http://software.intel.com/en-us/articles/implementing-multiple-touch-gestures-using-unity-3d-with-touchscript

    Multithreading Perceptual Computing Applications in Unity3D
    http://software.intel.com/en-us/blogs/2013/07/26/multithreading-perceptual-computing-applications-in-unity3d

    Unity3D Touch GUI Widgets
    http://software.intel.com/en-us/articles/unity-3d-touch-gui-widgets

    About the Author

    John Wesolowski, Intern
    The group I worked for at Intel focused on enabling Intel® chipsets for upcoming technology, with an emphasis on video games. It was our task to test the latest and upcoming video games to find potential bugs or areas of improvement in the Intel® architecture or in the games themselves.

    Outside of work, my all-time favorite activity used to be playing Halo* 2 online with my friends, but since Microsoft shut down all Xbox LIVE* service for original Xbox* games, my friends and I LAN Halo 2 whenever we can. I also enjoy playing poker and flying kites. I am currently attending California State University, Monterey Bay, pursuing a degree in Computer Science and Information Technology.

    Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.
    Copyright © 2013 Intel Corporation. All rights reserved.
    *Other names and brands may be claimed as the property of others.

  • Unity 3D
  • Touch Devices
  • Windows*
  • Game Development
    Intel at DICE Summit 2014


    Intel is excited to announce that we're the platinum sponsor for this year's 2014 D.I.C.E. Summit! Please stop by the D.I.C.E. Arcade area to catch a glimpse of both Intel's and Havok's latest gaming platform directions, technologies, and solutions. Intel is proud to promote this year's event, which will culminate with the 17th Annual D.I.C.E. Awards (Press Release), streamed live on Feb 6th beginning at 7:30 PM PST (Twitch Link Details).

    Over seven hundred attendees will catch a glimpse of the several demos we'll be showing in the D.I.C.E. Arcade lounge. The two couch demo areas will showcase thin and light mobile tablets and an Ultrabook™ playing Windows*-based PC games. On a corner demo station we will be showing one of the Gigabyte™ BRIX Steam Machines playing SteamOS PC gaming. Havok will also join us with a Project Anarchy demo and an announcement of the University Program winners. Other major highlights of the show include a fundraiser and several dynamic speaking sessions, all followed by the D.I.C.E. Awards. Here's a list of the big three highlights to focus on:

    • Intel: Kick your feet up and join us in the Intel couch demo area to see the latest and greatest thin and light tablets and an Ultrabook™ with Iris Pro Graphics. In addition, you won't want to miss the Gigabyte™ BRIX Steam Machine demo we'll have on hand. Meet our experts to discuss these exciting products and what's coming next.
    • Havok: Havok will be on hand with a Project Anarchy demo and an announcement of the University Program winners.
    • D.I.C.E. Live Twitch Feed: The 2014 D.I.C.E. Awards will be brought to you by Twitch! Watch it live! Details here.

    For D.I.C.E. Summit specifics please visit http://www.dicesummit.org/

    See you all there!

     

  • game development
  • gamers
  • PC Gaming
  • Mobile Gaming
  • Tablet Gaming
  • graphics
  • Processor Graphics

  • Game Development
  • Android*
  • Windows*
  • Laptop
  • Phone
  • Tablet
  • Desktop
  • Developers
  • Partners
  • Android*
  • Apple iOS*
  • Apple Mac OS X*
  • Linux*
  • Microsoft Windows* (XP, Vista, 7)
  • Microsoft Windows* 8