Developing ROS programs for the Sphero robot

You probably know the Sphero robot. It is a small robot in the shape of a ball. If you have one, you should know that it is possible to control it using ROS, by installing the Sphero ROS packages developed by Melonee Wise on your computer and connecting to the robot over the computer’s Bluetooth.

Now, you can use the ROS Development Studio to create ROS control programs for that robot, testing as you go by using the integrated simulation.

The ROS Development Studio (RDS) provides an off-the-shelf simulation of Sphero with a maze environment. The simulation provides the same interface as the ROS module created by Melonee, so you can develop and test your programs in the simulated environment and, once they work properly, transfer them to the real robot.

We created the simulation to teach ROS to the students of the Robot Ignite Academy. They have to learn enough ROS to make the Sphero get out of the maze using odometry and IMU data.

Using the simulation

To use the Sphero simulation on RDS go to rds.theconstructsim.com and sign in. If you select the Public simulations, you will quickly identify the Sphero simulation.

Press the red Play button. A new screen will appear giving you details about the simulation and asking you which launch file you want to launch. The main.launch selected by default is the correct one, so just press Run.

After a few seconds the simulation will appear together with the development environment for creating the programs for Sphero and testing them.

On the left-hand side you have a notebook containing information about the robot and how to program it with ROS. This notebook contains just a few examples, but you can complete or modify it as you wish. As you can see, it is an IPython notebook and follows that standard, so you are free to modify it or add new information. Remember that any change you make to the notebook will be saved with the simulation in your private area of RDS, so you can come back later and launch it with your modifications.

The code included in the notebook is directly executable: select the code cell (with a single click) and press the small play button at the top of the notebook. Once you press that button, the code will be executed and will control the simulated Sphero robot for a few time-steps (remember to have the simulation running, with its Play button activated, to see the robot move).

In the center area, you can see the IDE, the environment for developing the code. There you can browse all the packages related to the simulation, or any other packages that you create.

On the right-hand side, you can see the simulation and, beneath it, the shell. The simulation shows the Sphero robot as well as the maze environment. In the shell, you can issue commands on the computer that runs the simulation of the robot. For instance, you can use the shell to launch the keyboard controller and move the Sphero around. Try typing the following:
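As an illustration of what such a notebook cell might look like, here is a minimal sketch that drives the robot forward for a few time-steps and then stops it. The topic name `/cmd_vel`, the node name, the speed and the step count are assumptions made for this example and are not taken from the RDS notebook; check the notebook for the actual interface.

```python
def command_sequence(linear_speed, steps):
    """Return (linear_x, angular_z) pairs: drive forward, then one stop command."""
    return [(linear_speed, 0.0)] * steps + [(0.0, 0.0)]


def run(topic="/cmd_vel", linear_speed=0.2, steps=20, rate_hz=10):
    """Publish the command sequence as Twist messages (topic name is assumed)."""
    # ROS imports are kept inside the function so the helper above
    # can be used and tested on a machine without ROS installed.
    import rospy
    from geometry_msgs.msg import Twist

    rospy.init_node("sphero_forward_sketch")  # hypothetical node name
    pub = rospy.Publisher(topic, Twist, queue_size=1)
    rate = rospy.Rate(rate_hz)
    for linear_x, angular_z in command_sequence(linear_speed, steps):
        if rospy.is_shutdown():
            break
        msg = Twist()
        msg.linear.x = linear_x
        msg.angular.z = angular_z
        pub.publish(msg)
        rate.sleep()
```

Calling `run()` from a cell, with the simulation playing, should publish velocity commands for about two seconds at the default settings.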

  • $ roslaunch sphero_gazebo keyboard_teleop.launch

Now you should be able to move the robot around the maze by pressing keys on the keyboard (instructions are provided on the screen).

You can also launch Rviz there, and then watch the robot, the frames and any other additional information about the robot you may want. Type the following:

  • $ rosrun rviz rviz

Then press the red Screen icon located at the bottom-left of the screen (named the graphical tools). A new tab should appear, showing Rviz loading. After a while, you can configure Rviz to show the information you want.

There are many ways you can configure the screen to provide more focus to what interests you the most.

To end this post, I would like to point out that you can download the simulation to your computer at any time by right-clicking on the directories and selecting Download. You can also clone The Construct simulations repository to download it (along with the other simulations available).


Split-second decisions: Navigating the fine line between man and machine

Level 3 automation, where the car handles all aspects of driving with the driver on standby, is being tested in Sweden. Image courtesy of Volvo cars

Today’s self-driving car isn’t exactly autonomous – the driver has to be able to take over in a pinch, and therein lies the roadblock researchers are trying to overcome. Automated cars are hurtling towards us at breakneck speed, with all-electric Teslas already running limited autopilot systems on roads worldwide and Google trialling its own autonomous pod cars.

However, before we can reply to emails while being driven to work, we have to have a foolproof way to determine when drivers can safely take control and when it should be left to the car.

‘Even in a limited number of tests, we have found that humans are not always performing as required,’ explained Dr Riender Happee, from Delft University of Technology in the Netherlands, who is coordinating the EU-funded HFAuto project to examine the problem and potential solutions.

‘We are close to concluding that the technology always has to be ready to resolve the situation if the driver doesn’t take back control.’

But in these car-to-human transitions, how can a computer decide whether it should hand back control?

‘Eye tracking can indicate driver state and attention,’ said Dr Happee. ‘We’re still to prove the practical usability, but if the car detects the driver is not in an adequate state, the car can stop in the safety lane instead of giving back control.’

Next level

It’s all a question of the level of automation. According to the scale of US-based standards organisation SAE International, Level 1 automation already exists in the form of automated braking and self-parking.

Level 4 & 5 automation, where you punch in the destination and sit back for a nap, is still on the horizon.

But we’ll soon reach Level 3 automation, where drivers can hand over control in situations like motorway driving and let their attention wander, as long as they can safely intervene when the car asks them to.

HFAuto’s 13 PhD students have been researching this human-machine transition challenge since 2013.

Backed with Marie Skłodowska-Curie action funding, the students have travelled Europe for secondments, to examine carmakers’ latest prototypes, and to carry out simulator and on-road tests of transition takeovers.

Alongside further trials of their transition interface, HFAuto partner Volvo has already started testing 100 highly automated Level 3 cars on Swedish public roads.

Another European research group is approaching the problem with a self-driving system that uses external sensors together with cameras inside the cab to monitor the driver’s attentiveness and actions.

Blink

‘Looking at what’s happening in the scene outside of the cars is nothing without the perspective of what’s happening inside the car,’ explained Dr Oihana Otaegui, head of the Vicomtech-IK4 applied research centre in San Sebastián, Spain.

She coordinates the work as part of the EU-funded VI-DAS project. The idea is to avoid high-risk transitions by monitoring factors like a driver’s gaze, blinking frequency and head pose — and combining this with real-time on-road factors to calculate how much time a driver needs to take the wheel.

Its self-driving system uses external cameras as affordable sensors, collecting data for the underlying artificial intelligence system, which tries to understand road situations like a human would.

VI-DAS is also studying real accidents to discern challenging situations where humans fail and using this to help train the system to detect and avoid such situations.

The group aims to have its first interface prototype working by September, with iterated prototypes appearing at the end of 2018 and 2019.

Dr Otaegui says the system could have potential security sector uses given its focus on creating artificial intelligence perception in any given environment, and hopes it could lead to fully automated driving.

‘It could even go down the path of Levels 4 and 5, depending on how well we can teach our system to react — and it will indeed be improving all the time we are working on this automation.’

The question of transitions is so important because it has an impact on liability – who is responsible in the case of an accident.

It’s clear that Level 2 drivers can be held liable if they cause a fender bender, while carmakers will take the rap once Level 4 is deployed. However, with Level 3 transitions, liability remains a burning question.

HFAuto’s Dr Happee believes the solution lies in specialist insurance options that will emerge.

‘Insurance solutions are expected (to emerge) where a car can be bought with risk insurance covering your own errors, and those which can be blamed on carmakers,’ he said.

Yet it goes further than that. Should a car choose to hit pedestrians in the road, or swerve into the path of an oncoming lorry, killing its occupants?

‘One thing coming out of our discussions is that no one would buy a car which will sacrifice its owner for the lives of others,’ said Dr Happee. ‘So it comes down to making these as safe as possible.’

The five levels of automation:

  1. Driver Assistance: the car can either steer or regulate speed on its own.
  2. Partial Automation: the vehicle can handle both steering and speed selection on its own in specific controlled situations, such as on a motorway.
  3. Conditional Automation: the vehicle can be instructed to handle all aspects of driving, but the driver needs to be on standby to intervene if needed.
  4. High Automation: the vehicle can be instructed to handle all aspects of driving, even if the driver is not available to intervene.
  5. Full Automation: the vehicle handles all aspects of driving, all the time.

Robotic science may (or may not) help us keep up with the death of bees

Credit: SCAD

Beginning in 2006 beekeepers became aware that their honeybee populations were dying off at increasingly rapid rates. Scientists are also concerned about the dwindling populations of monarch butterflies. Researchers have been scrambling to come up with explanations and an effective strategy to save both insects or replicate their pollination functions in agriculture.

Photo: SCAD

Although the Plan Bee drones pictured above are just one SCAD (Savannah College of Art and Design) student’s concept for how a swarm of drones could handle pollinating an indoor crop, scientists are considering different options for dealing with the crisis, using modern technology to replace living bees with robotic ones. Researchers from the Wyss Institute and the School of Engineering and Applied Sciences at Harvard introduced the first RoboBees in 2013, and other scientists around the world have been researching and designing their solutions ever since.

Honeybees pollinate almost a third of all the food we consume and, in the U.S., account for more than $15 billion worth of crops every year. Apples, berries, cucumbers and almonds rely on bees for their pollination. Butterflies also pollinate, but less efficiently than bees and mostly they pollinate wildflowers.

The National Academy of Sciences said:

“Honey bees enable the production of no fewer than 90 commercially grown crops as part of the large, commercial, beekeeping industry that leases honey bee colonies for pollination services in the United States.

Although overall honey bee colony numbers in recent years have remained relatively stable and sufficient to meet commercial pollination needs, this has come at a cost to beekeepers who must work harder to counter increasing colony mortality rates.”

Florida and California have been hit especially hard by decreasing bee colony populations. In 2006, California produced nearly twice as much honey as the next state. But in 2011, California’s honey production fell by nearly half. The recent severe drought in California has become an additional factor driving both its honey yield and bee numbers down, as less rain means fewer flowers available to pollinate.

In the U.S., the Obama Administration created a task force which developed The National Pollinator Health Strategy plan to:

  • Restore honey bee colony health to sustainable levels by 2025.
  • Increase Eastern monarch butterfly populations to 225 million butterflies by year 2020.
  • Restore or enhance seven million acres of land for pollinators over the next five years.

For this story, I wrote to the EPA specialist for bee pollination asking whether funding was continuing under the Trump Administration or whether the program itself was to be continued. No answer.

Japan’s National Institute of Advanced Industrial Science and Technology scientists have invented a drone that transports pollen between flowers using horsehair coated in a special sticky gel. And scientists at the Universities of Sheffield and Sussex (UK) are attempting to produce the first accurate model of a honeybee brain, particularly those portions of the brain that enable vision and smell. Then they intend to create a flying robot able to sense and act as autonomously as a bee.

Bottom Line:

As novel and technologically interesting as these inventions may be, their costs will need to come close to the present costs of pollination. Or, as biologist Dave Goulson said to a Popular Science reporter, “Even if bee bots are really cool, there are lots of things we can do to protect bees instead of replacing them with robots.”

Saul Cunningham, of the Australian National University, confirmed that sentiment by showing that today’s concepts are far from being economically feasible:

“If you think about the almond industry, for example, you have orchards that stretch for kilometres and each individual tree can support 50,000 flowers,” he says. “So the scale on which you would have to operate your robotic pollinators is mind-boggling.”

“Several more financially viable strategies for tackling the bee decline are currently being pursued including better management of bees through the use of fewer pesticides, breeding crop varieties that can self-pollinate instead of relying on cross-pollination, and the use of machines to spray pollen over crops.”

Beyond 5G: NSF awards $6.1 million to accelerate advanced wireless research

The National Science Foundation (NSF) announced a $6.1 million, five-year award to accelerate fundamental research on wireless communication and networking technologies through the foundation’s Platforms for Advanced Wireless Research (PAWR) program.

Through the PAWR Project Office (PPO), award recipients US Ignite, Inc. and Northeastern University will collaborate with NSF and industry partners to establish and oversee multiple city-scale testing platforms across the United States. The PPO will manage nearly $100 million in public and private investments over the next seven years.

“NSF is pleased to have the combined expertise from US Ignite, Inc. and Northeastern University leading the project office for our PAWR program,” said Jim Kurose, NSF assistant director for Computer and Information Science and Engineering. “The planned research platforms will provide an unprecedented opportunity to enable research in faster, smarter, more responsive, and more robust wireless communication, and move experimental research beyond the lab — with profound implications for science and society.”

Over the last decade, the use of wireless, internet-connected devices in the United States has nearly doubled. As the momentum of this exponential growth continues, the need for increased capacity to accommodate the corresponding internet traffic also grows. This surge in devices, including smartphones, connected tablets and wearable technology, places an unprecedented burden on conventional 4G LTE and public Wi-Fi networks, which may not be able to keep pace with the growing demand.

NSF established the PAWR program to foster use-inspired, fundamental research and development that will move beyond current 4G LTE and Wi-Fi capabilities and enable future advanced wireless networks. Through experimental research platforms that are at the scale of small cities and communities and designed by the U.S. academic and industry wireless research community, PAWR will explore robust new wireless devices, communication techniques, networks, systems and services that will revolutionize the nation’s wireless systems. These platforms aim to support fundamental research that will enhance broadband connectivity and sustain U.S. leadership and economic competitiveness in the telecommunications sector for many years to come.

“Leading the PAWR Project Office is a key component of US Ignite’s mission to help build the networking foundation for smart communities,” said William Wallace, executive director of US Ignite, Inc., a public-private partnership that aims to support ultra-high-speed, next-generation applications for public benefit. “This effort will help develop the advanced wireless networks needed to enable smart and connected communities to transform city services.”

Establishing the PPO with this initial award is the first step in launching a long-term, public-private partnership to support PAWR. Over the next seven years, PAWR will take shape through two multi-stage phases:

  • Design and Development. The PPO will assume responsibility for soliciting and vetting proposals to identify the platforms for advanced wireless research and work closely with sub-awardee organizations to plan the design, development, deployment and initial operations of each platform.
  • Deployment and Initial Operations. The PPO will establish and manage each platform and document best practices as it progresses through the lifecycle.

“We are delighted that our team of wireless networking researchers has been selected to take the lead of the PAWR Project Office in partnership with US Ignite, Inc.,” said Dr. Nadine Aubry, dean of the college of engineering and university distinguished professor at Northeastern University. “I believe that PAWR, by bringing together academia, industry, government and communities, has the potential to make a transformative impact through advances spanning fundamental research and field platforms in actual cities.”

The PPO will work closely with NSF, industry partners and the wireless research community in all aspects of PAWR planning, implementation and management. Over the next seven years, NSF anticipates investing $50 million in PAWR, combined with approximately $50 million in cash and in-kind contributions from over 25 companies and industry associations. The PPO will disburse these investments to support the selected platforms.

Additional information can be found on the PPO webpage.

This announcement will also be highlighted this week during the panel discussion, “Wireless Network Innovation: Smart City Foundation,” at the South by Southwest conference in Austin, Texas.

UgCS photogrammetry technique for UAV land surveying missions

 

UgCS is easy-to-use software for planning and flying UAV drone-survey missions. It supports almost any UAV platform, providing convenient tools for areal and linear surveys and enabling direct drone control. What’s more, UgCS enables professional land survey mission planning using photogrammetry techniques.

How to plan a photogrammetry mission with UgCS

Standard land surveying photogrammetry mission planning with UgCS can be divided into the following steps:

  1. Obtain input data
  2. Plan mission
  3. Deploy ground control points
  4. Fly mission
  5. Image geotagging
  6. Data processing
  7. Map import to UgCS (optional)

Step one: Obtain input data

Firstly, to reach the desired result, input settings have to be defined:

  • Required GSD (ground sampling distance – the size of a single pixel on the ground),
  • Survey area boundaries,
  • Required forward and side overlap.

GSD and area boundaries are usually defined by the customer’s requirements for the output material, for example by the scale and resolution of the digital map. Overlap should be chosen according to the specific conditions of the survey area and the requirements of the data processing software.

Each data processing package (e.g., Pix4D, Agisoft Photoscan, Dronedeploy, Acute 3d) has specific requirements for side and forward overlap for different surfaces. To choose correct values, please refer to the documentation of the chosen software. In general, 75% forward and 60% side overlap will be a good choice. Overlap should be increased for areas with few visual cues, for example deserts or forests.
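To see what these overlap percentages mean on the ground, the following sketch converts them into the distance between consecutive photos and the distance between survey lines. It is an illustration only, not how UgCS computes its route: it assumes a nadir-pointing camera with the long image side across the flight direction, and the 2cm GSD and 6000 x 4000 pixel image size are example values.

```python
def survey_spacing(gsd_m, image_width_px, image_height_px,
                   forward_overlap=0.75, side_overlap=0.60):
    """Return (photo_spacing_m, line_spacing_m) for a nadir camera.

    Assumes the image width lies across the flight direction and the
    image height lies along it.
    """
    if not (0 <= forward_overlap < 1 and 0 <= side_overlap < 1):
        raise ValueError("overlaps must be in the range [0, 1)")
    # Ground footprint of a single photo.
    footprint_across_m = gsd_m * image_width_px
    footprint_along_m = gsd_m * image_height_px
    # Only the non-overlapping part of each footprint is new ground.
    photo_spacing_m = footprint_along_m * (1 - forward_overlap)
    line_spacing_m = footprint_across_m * (1 - side_overlap)
    return photo_spacing_m, line_spacing_m


photo_m, line_m = survey_spacing(0.02, 6000, 4000)
print(f"photo every {photo_m:.1f} m, survey lines {line_m:.1f} m apart")
```

Raising either overlap shrinks the corresponding spacing, which is why higher overlap means more photos and a longer flight.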

Often, aerial photogrammetry beginners are excited about the option to produce digital maps with extremely high resolution (1-2cm/pixel), and to use very small GSD for mission planning. This is very bad practice. Small GSD will result in longer flight time, hundreds of photos for each acre, tens of hours of processing and heavy output files. GSD should be set according to the output requirements of the digital map.

Other limitations can apply. For example, suppose a GSD of 10cm/pixel is required and the survey is to be flown with a Sony A6000 camera. Based on that GSD and the camera’s parameters, the flight altitude would be set to 510 meters. In most countries, the maximum allowed altitude for UAVs (without special permission) is limited to 120m/400ft AGL (above ground level). Taking the maximum allowed altitude into account, the largest possible GSD in this case is no more than about 2.3cm.
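The relationship between GSD and altitude in this example can be reproduced with the usual pinhole-camera relation, GSD = altitude x pixel pitch / focal length. The sketch below is an approximation under stated assumptions: the sensor width (23.5mm) and image width (6000 pixels) are the commonly published Sony A6000 figures, and the 20mm focal length is an assumption chosen because it reproduces the numbers in the text; the article does not say which lens was used.

```python
def pixel_pitch_m(sensor_width_mm, image_width_px):
    """Physical size of one sensor pixel, in meters."""
    return sensor_width_mm / 1000.0 / image_width_px


def altitude_for_gsd(gsd_m, focal_length_mm, sensor_width_mm, image_width_px):
    """Flight altitude (m above ground) that yields the requested GSD."""
    pitch = pixel_pitch_m(sensor_width_mm, image_width_px)
    return gsd_m * (focal_length_mm / 1000.0) / pitch


def gsd_at_altitude(altitude_m, focal_length_mm, sensor_width_mm, image_width_px):
    """GSD (m per pixel) obtained when flying at the given altitude."""
    pitch = pixel_pitch_m(sensor_width_mm, image_width_px)
    return altitude_m * pitch / (focal_length_mm / 1000.0)


# Assumed camera parameters (not stated in the text): 20 mm lens.
FOCAL_MM, SENSOR_MM, WIDTH_PX = 20.0, 23.5, 6000

alt = altitude_for_gsd(0.10, FOCAL_MM, SENSOR_MM, WIDTH_PX)
gsd = gsd_at_altitude(120.0, FOCAL_MM, SENSOR_MM, WIDTH_PX)
print(f"altitude for 10 cm GSD: {alt:.0f} m")
print(f"GSD at 120 m: {gsd * 100:.2f} cm")
```

With these assumed parameters the altitude comes out at roughly 510m and the GSD at the 120m ceiling at roughly 2.3cm, in line with the figures above.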

Step two: Plan your mission

Mission planning consists of two stages:

  • Initial planning,
  • Route optimisation.

-Initial planning:

The first step is to set the survey area using the UgCS Photogrammetry tool. The area can be set using visual cues on the underlying map or using the exact coordinates of its corners. As a result, the survey area is marked with yellow boundaries (Figure 1).

Figure 1: Setting the survey area

The next step is to set the GSD and overlap for the camera in the Photogrammetry tool’s settings window (Figure 2).

Figure 2: Setting camera’s Ground Sampling Distance and overlapping

To take photos, define the camera’s control action in the Photogrammetry tool’s settings window (Figure 3). Set the camera by distance triggering action with default values.

Figure 3: Setting camera’s control action

At this point, initial route planning is completed. UgCS will automatically calculate the photogrammetry route (see Figure 4).

Figure 4: Calculated photogrammetry survey route before optimisation

-Route optimisation

To optimise the route, its calculated parameters should be known: altitude, estimated flight time, number of shots, etc.

Part of the route’s calculated information can be found in the elevation profile window. To access the elevation profile window (if it is not visible on screen) click the parameters icon on the route card (lower-right corner, see Figure 5), and from the drop-down menu select show elevation:

Figure 5: Accessing elevation window from Route cards Parameters settings

The elevation profile window will present an estimated route length, duration, waypoint count and min/max altitude data:

Figure 6: Route values in elevation profile window

To get other calculated values, open route log by clicking on route status indicator: the green check-mark (upper-right corner, see Figure 7) of the route card:

Figure 7: Route card and status indicator, Route log

Using these parameters, the route can be optimised to be more efficient and safer.

-Survey line direction

By default, UgCS will trace survey lines from south to north. But in most cases it is more efficient to fly parallel to the longest boundary line of the survey area. To change the survey line direction, edit the direction angle field in the photogrammetry tool. In the example, changing the angle to 135 degrees reduces the number of passes from five (Figure 4) to four (Figure 8) and the route length from 1.3km to 1km.

Figure 8: Changed survey line angle to be parallel to longest boundary
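One way to find a candidate direction angle is to compute the bearing of the longest boundary edge of the survey polygon. The sketch below is only an illustration: it assumes the polygon is given in local planar coordinates (east, north) in meters, and it reports the angle as a compass bearing measured clockwise from north and folded into the range 0 to 180 degrees. Whether this matches the convention of the UgCS direction angle field should be checked in the tool itself.

```python
import math


def longest_edge_bearing_deg(polygon_en):
    """Bearing (degrees clockwise from north, in [0, 180)) of the longest edge.

    polygon_en is a list of (east_m, north_m) vertices in order; the last
    vertex is implicitly connected back to the first.
    """
    best_length = -1.0
    best_bearing = 0.0
    count = len(polygon_en)
    for i in range(count):
        east1, north1 = polygon_en[i]
        east2, north2 = polygon_en[(i + 1) % count]
        d_east, d_north = east2 - east1, north2 - north1
        length = math.hypot(d_east, d_north)
        if length > best_length:
            best_length = length
            # atan2(east, north) gives a compass bearing; a survey line has
            # no preferred direction, so fold the result into [0, 180).
            best_bearing = math.degrees(math.atan2(d_east, d_north)) % 180.0
    return best_bearing


# Hypothetical survey area whose long sides run from north-west to south-east.
area = [(0, 0), (300, -300), (400, -200), (100, 100)]
print(f"suggested direction angle: {longest_edge_bearing_deg(area):.0f} degrees")
```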

-Altitude type

The UgCS Photogrammetry tool has the option to define how the route is traced with respect to altitude: with constant altitude above ground level (AGL) or above mean sea level (AMSL). Please refer to your data processing software’s requirements to see which altitude tracking method it recommends.

In the UgCS team’s experience, the choice of altitude type depends on the desired result. For an orthophotomap (the standard aerial land survey output format) it is better to choose AGL to ensure constant GSD for the entire map. If the aim is to produce a DEM or 3D reconstruction, use AMSL so the data processing software has more data to correctly determine ground elevation from the photos and provide higher-quality output.

Figure 9: Elevation profile with constant altitude above mean sea level (AMSL)

In this case, UgCS will calculate flight altitude based on the lowest point of the survey area.

If AGL is selected in photogrammetry tool’s settings, UgCS will calculate the altitude for each waypoint. But in this case, terrain following will be rough if no “additional waypoints” are added (see Figure 10).

Figure 10: Elevation profile with AGL without additional waypoints

Therefore, if AGL is used, enable the “additional waypoints” flag and UgCS will calculate a flight plan that follows the elevation profile accordingly (see Figure 11).

Figure 11: Elevation profile with AGL with additional waypoints

-Speed

In general, increasing flight speed reduces flight time. But high speed in combination with a long camera exposure can result in blurred images. In most cases 10m/s is the best choice.
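The trade-off between speed and exposure can be estimated as motion blur in pixels: the distance the drone travels during the exposure, divided by the GSD. The sketch below is a rough first-order estimate that ignores wind, vibration and gimbal movement; the exposure times, the 2cm GSD and the one-pixel blur limit are example values, not recommendations from UgCS.

```python
def motion_blur_px(speed_m_s, exposure_s, gsd_m):
    """Ground distance travelled during the exposure, expressed in pixels."""
    return speed_m_s * exposure_s / gsd_m


def max_speed_for_blur(max_blur_px, exposure_s, gsd_m):
    """Highest ground speed that keeps motion blur within max_blur_px."""
    return max_blur_px * gsd_m / exposure_s


# Example: 10 m/s flight speed, 2 cm GSD, two different shutter speeds.
for exposure in (1 / 1000, 1 / 250):
    blur = motion_blur_px(10.0, exposure, 0.02)
    print(f"exposure {exposure:.4f} s -> blur of about {blur:.1f} px")
```

The same speed that is harmless at a fast shutter speed smears each pixel over several neighbours at a slow one, so speed and exposure should be chosen together.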

-Camera control method

UgCS supports 3 camera control methods (actions):

  1. Make a shot (trigger camera) in waypoint,
  2. Make shot every N seconds,
  3. Make shot every N meters.

Not all autopilots support all 3 camera control options. For example, the (quite old) DJI A2 supports all three options, but newer DJI drones (starting from Phantom 3 and up to M600) support only triggering in waypoints and by time. DJI promised to implement triggering by distance, but it’s not available yet.

Here are some benefits and drawbacks for all three methods:

Table 1: Benefits and Drawbacks for camera triggering methods

In conclusion:

  • Trigger in waypoints should be preferred when possible
  • Trigger by time should be used only if no other method is possible
  • Trigger by distance should be used when triggering in waypoints is not possible to use
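The preference order above can be written down as a small selection helper. This is a sketch for planning scripts, not part of UgCS: the method names are hypothetical labels, and the conversion from photo spacing to a time interval assumes a constant ground speed along the survey line.

```python
# Preference order taken from the conclusions above.
PREFERENCE = ("waypoint", "distance", "time")


def choose_trigger(supported):
    """Pick the most preferred triggering method that the autopilot supports."""
    for method in PREFERENCE:
        if method in supported:
            return method
    raise ValueError("autopilot supports no known triggering method")


def time_interval_s(photo_spacing_m, speed_m_s):
    """Seconds between shots that give the desired spacing at constant speed."""
    if speed_m_s <= 0:
        raise ValueError("speed must be positive")
    return photo_spacing_m / speed_m_s


# An autopilot that supports only waypoint and time triggering.
print(choose_trigger({"waypoint", "time"}))
# If only time triggering were available: 20 m spacing at 10 m/s.
print(time_interval_s(20.0, 10.0), "seconds between shots")
```

Time triggering depends on the drone actually holding the planned speed, which is one reason it comes last in the order.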

To select triggering method in UgCS Photogrammetry tool accordingly, use one of three available icons:

  • Set camera mode
  • Set camera by time
  • Set camera by distance

-Gimbal control

Drones, e.g., DJI Phantom 3, Phantom 4, Inspire, M100 or M600 with integrated gimbal, have the option to control camera position as part of an automatic route plan.

It is advisable to set the camera to the nadir position at the first waypoint, and to the horizontal position before landing, to protect the lens from potential damage.

To set camera position, select the waypoint preceding the photogrammetry area and click set camera attitude/zoom (Figure 12) and enter “90” in the “Tilt” field (Figure 13).

Figure 12: Setting camera attitude
Figure 13: Setting camera position

As described in the Turn types section below, this waypoint should be a Stop&Turn type, otherwise the drone could skip this action.

To set the camera to the horizontal position, select the last waypoint of the survey route, click set camera attitude/zoom and enter “0” in the “Tilt” field.

-Turn types

Most autopilots or multirotor drones support different turn types in waypoints. Most popular DJI drones have three turn-types:

  • Stop and Turn: the drone flies to the fixed point accurately, stops at that fixed point and then flies to the next fixed point.
  • Bank Turn: the drone flies at constant speed from one point to another without stopping.
  • Adaptive Bank Turn: performs almost the same as Bank Turn mode (Figure 14), but the actual flight path follows the planned route more accurately than with Bank Turn.

It is advisable not to use Bank Turn for photogrammetry missions. The drone interprets a Bank Turn waypoint as a “recommended destination”: it will fly towards the waypoint but will almost never pass through it. Because the drone does not pass through the waypoint, no action will be executed, meaning the camera will not be triggered, etc.

Adaptive Bank Turn should be used with caution because a drone can miss waypoints and, again, no camera triggering will be initiated.

Figure 14: Illustration of typical DJI drone trajectories for Bank Turn and Adaptive Bank Turn types

Sometimes, adaptive bank turn type has to be used in order to have shorter flight time compared to stop and turn. When using adaptive bank turns, it is recommended to use overshot (see below) for the photogrammetry area.

-Overshot

Initially overshot was implemented for fixed wing (airplane) drones in order to have enough space for manoeuvring a U-turn.

Overshot can be set in photogrammetry tool to add an extra segment to both ends of each survey line.

Figure 15: Adding 40m overshot to both ends of each survey line

The example (Figure 15) shows that UgCS added 40m segments to both ends of each survey line (compare with Figure 8).

Adding overshot is useful for copter-UAVs in two situations:

  1. When Adaptive Bank Turns are used (or a similar method for non-DJI drones), adding overshot increases the chance that the drone will enter the survey line precisely and that the camera control action will be triggered. UgCS Team recommends specifying an overshot approximately equal to the distance between the parallel survey lines.
  2. When the Stop and Turn type is used in combination with the action to trigger the camera in waypoints, the drone may start rotating towards the next waypoint before taking the shot, which can result in blurred photos or photos with the wrong orientation. To avoid that, set a shorter overshot, for example 5m. Don’t specify too short a value (< 3m), because some drones may ignore waypoints that are too close together.
Figure 16: Example of blurred image taken by drone in rotation to next waypoint
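The two overshot recommendations above can be summarised in a small helper. It is a sketch of the rule of thumb only: the turn-type labels are hypothetical names chosen for this example, and the 5m and 3m values are the ones quoted in the text.

```python
MIN_OVERSHOT_M = 3.0  # shorter segments may be ignored by some drones


def recommended_overshot_m(turn_type, line_spacing_m,
                           stop_and_turn_overshot_m=5.0):
    """Suggest an overshot length for a copter survey route.

    turn_type is "adaptive_bank" or "stop_and_turn" (labels used only in
    this sketch); line_spacing_m is the distance between survey lines.
    """
    if turn_type == "adaptive_bank":
        # Roughly equal to the distance between parallel survey lines.
        overshot = line_spacing_m
    elif turn_type == "stop_and_turn":
        # A short segment so the photo is taken before the drone rotates.
        overshot = stop_and_turn_overshot_m
    else:
        raise ValueError(f"unknown turn type: {turn_type}")
    return max(overshot, MIN_OVERSHOT_M)


print(recommended_overshot_m("adaptive_bank", line_spacing_m=40.0))
print(recommended_overshot_m("stop_and_turn", line_spacing_m=40.0))
```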

-Takeoff point

It is important to check the takeoff area on site before flying any mission! To explain best practice for setting the takeoff point, let us first discuss an example of how it should not be done. Suppose that the takeoff point in our example mission (Figure 17) is the point marked with the airplane icon, and that the drone pilot uploads the route on the ground with the mission set for automatic takeoff.

Figure 17: Take-off point example

Most drones in automatic takeoff mode climb to a low altitude of about 3-10 meters and then fly straight towards the first waypoint. Other drones fly towards the first waypoint straight from the ground. Looking closely at the example map (Figure 17), some trees can be seen between the takeoff point and the first waypoint. In this example, the drone will most likely not reach a safe altitude and will hit the trees.

The surroundings are not the only thing that can affect takeoff planning. Drone manufacturers can change a drone’s elevation behavior in its firmware, so after firmware updates it is recommended that you check the drone’s automatic takeoff mode.

Also, a very important consideration is that most small UAVs use relative altitude for mission planning. The fact that altitude is counted relative to the first waypoint is a second reason why the actual takeoff point should be near the first waypoint, and on the same terrain level.

UgCS Team recommends placing the first waypoint as close as possible to the actual takeoff point and specifying a safe takeoff altitude (≈30m will be above any trees in most situations, see Figure 18). This is the only method that ensures a safe takeoff for any mission. It also protects against unexpected drone behaviour, unpredictable firmware updates, etc.

Figure 18: Route with safe take-off

-Entry point to the survey grid

In the previous example (see Figure 18), it can be noticed that the route’s survey grid entry point changed after the takeoff point was added. This is because, when an additional waypoint is added ahead of the photogrammetry area in the route, UgCS plans the survey grid to start from the corner nearest to the previous waypoint.

To change the entry point to the survey grid, add an additional waypoint close to the desired starting corner (see Figure 19).
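The nearest-corner rule described above can be sketched in a few lines of Python. This is not the UgCS planner itself, only an illustration of the selection rule; the function name, the local planar coordinates (in meters) and the sample points are assumptions.

```python
import math


def nearest_corner(corners, previous_waypoint):
    """Return the survey-area corner closest to the waypoint that precedes the area."""
    return min(corners, key=lambda corner: math.dist(corner, previous_waypoint))


# Hypothetical rectangular survey area in local x/y coordinates (meters).
corners = [(0, 0), (100, 0), (100, 50), (0, 50)]

# With the takeoff waypoint to the north-east, the grid is entered at the north-east corner.
entry_default = nearest_corner(corners, (110, 60))

# Adding a waypoint near the south-west corner moves the entry point there.
entry_moved = nearest_corner(corners, (-5, -5))

print(entry_default, entry_moved)
```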

Figure 19: Changing survey grid entry point by adding additional waypoint

-Landing point

If no landing point is added outside the photogrammetry area, the drone will fly to the last waypoint after the survey mission and hover there. There are two options for landing:

  1. Take manual control over the drone and fly to landing point manually,
  2. Activate the Return Home command in UgCS or from Remote Controller (RC).

If the radio link with the drone is lost, for example because the survey area is large or there are problems with the remote controller, one of the following can occur, depending on the drone and its settings:

  • The drone returns to the home location automatically once the radio link with the ground station is lost,
  • The drone flies to the last waypoint of the survey area and hovers for as long as battery capacity allows; it then either performs an emergency landing or tries to fly to the home location.

The recommendation is to add an explicit landing point to the route in order to avoid relying on unpredictable drone behavior or settings.

If the drone doesn’t support automatic landing, or the pilot prefers to land manually, place the route’s last waypoint over the planned landing point, at an altitude that is above any obstacles in the surrounding area and comfortable for a manual descent and landing. In general, 30 m is the best choice.

-Action execution

The Photogrammetry tool has a magic parameter, “Action Execution”, with three possible values:

  • Every point
  • At start
  • Forward passes

This parameter defines how and where camera actions specified for photogrammetry tool will be executed.

The most useful option for photogrammetry/survey missions is Forward passes: the drone will take photos only on survey lines and will not take excess photos on perpendicular lines.
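To make the effect of the Forward passes option concrete, here is a minimal Python sketch of a lawnmower pattern in which only the survey lines trigger the camera. It is a simplified model and not UgCS code; the function name, the segment labels and the dimensions are hypothetical.

```python
def serpentine_segments(n_lines, line_length_m, spacing_m):
    """Build a simple lawnmower route as (start, end, kind) segments."""
    segments = []
    for i in range(n_lines):
        y = i * spacing_m
        # Alternate the flight direction on every survey line.
        x_start, x_end = (0.0, line_length_m) if i % 2 == 0 else (line_length_m, 0.0)
        segments.append(((x_start, y), (x_end, y), "survey_line"))
        if i < n_lines - 1:
            # Short perpendicular hop to the start of the next survey line.
            segments.append(((x_end, y), (x_end, y + spacing_m), "perpendicular"))
    return segments


route = serpentine_segments(n_lines=3, line_length_m=200.0, spacing_m=40.0)

# Forward passes: the camera action runs on survey lines only.
photo_segments = [segment for segment in route if segment[2] == "survey_line"]

print(len(route), "segments in total,", len(photo_segments), "with camera actions")
```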

-Complex survey areas

UgCS enables photogrammetry/survey mission planning for irregular areas: any number of photogrammetry areas can be combined in one route, which avoids splitting the area into separate routes.

For example, if a mission has to be planned for two fields connected in a T-shape and the two fields are marked as one photogrammetry area, the whole route will not be optimal, regardless of the direction of the survey lines.

Figure 20: Complex survey area before optimisation

If the survey area is marked as two photogrammetry areas within one route, survey lines for each area can be optimised individually (see Figure 21).

Figure 21: Optimised survey flight passes for each part of a complex photogrammetry area

Step three: deploy ground control points

Ground control points are mandatory if the survey output map has to be precisely aligned to coordinates on Earth.

There are lots of discussions about the necessity of ground control points in cases when a drone is equipped with Real Time Kinematics (RTK) GPS receivers with centimeter-level accuracy. This is useful, but the drone coordinates are not in themselves sufficient because, for precise map aligning, image center coordinates are necessary.

Data processing software such as Agisoft Photoscan, Dronedeploy, Pix4d, Icarus OneButton and others will produce very accurate maps using geotagged images, but the real precision of the map will not be known without ground control points.

Conclusion: ground control points have to be used to create a survey-grade result. For a map with approximate precision, it is sufficient to rely just on RTK GPS and the capabilities of data processing software.

Step four: fly your mission

For a carefully planned mission, flying is the most straightforward step. Mission execution differs according to the type of UAV and equipment used, so it is not described in detail in this article (please refer to the equipment’s and UgCS documentation).

Important issues before flying:

  • In most countries there are strict regulations for UAV usage. Always comply with the regulations! Usually these rules can be found on the website of the local aviation authority.
  • In some countries special permission is needed for any kind of aerial photo/video shooting. Please check local regulations.
  • In most cases missions are planned before arriving at the flying location (e.g., in the office or at home) using satellite imagery from Google Maps, Bing, etc. Before flying, always check the actual circumstances at the location. There could be a need to adjust take-off/landing points, for example, to avoid tall obstacles (e.g., trees, masts, power lines) in your survey area.

Step five: image geotagging

Image geotagging is optional if ground control points were used, but almost any data processing software will require less time to process geotagged images.

Some of the latest and professional drones with integrated cameras can geotag images automatically during flight. In other cases images can be geotagged in UgCS after flight.

Very important: UgCS uses the telemetry log from the drone, received via the radio channel, to extract the drone’s altitude at any given moment (when pictures were taken). To geotag pictures using UgCS, ensure robust telemetry reception during flight.
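Why telemetry reception matters can be seen from a minimal sketch of how a position could be looked up for a photo timestamp. This is not how UgCS is implemented; it simply assumes a log of (time, latitude, longitude, altitude) samples and linear interpolation between the two nearest samples. All names and values are hypothetical.

```python
from bisect import bisect_left


def interpolate_telemetry(log, t):
    """Return (lat, lon, alt_m) at time t from a log of (time_s, lat, lon, alt_m) rows sorted by time."""
    times = [row[0] for row in log]
    if not times[0] <= t <= times[-1]:
        raise ValueError("photo timestamp lies outside the telemetry log")
    i = bisect_left(times, t)
    if times[i] == t:
        return tuple(log[i][1:])
    t0, *p0 = log[i - 1]
    t1, *p1 = log[i]
    weight = (t - t0) / (t1 - t0)
    # Linear interpolation of every coordinate between the two neighbouring samples.
    return tuple(a + weight * (b - a) for a, b in zip(p0, p1))


# Hypothetical telemetry samples two seconds apart.
telemetry = [
    (0.0, 56.0000, 24.0, 30.0),
    (2.0, 56.0002, 24.0, 32.0),
]

lat, lon, alt = interpolate_telemetry(telemetry, 1.0)
print(lat, lon, alt)
```

The longer the gap between two received samples, the further the drone may have strayed from the straight line assumed here, which is one way to see why dropouts in the telemetry link degrade the geotags.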

For detailed information on how to geotag images using UgCS, refer to the UgCS User Manual.

Step six: data processing

For data processing, use third party software or services available on the market.

From UgCS Team experience, the most powerful and flexible software is Agisoft Photoscan (http://www.agisoft.com/), but sometimes too much user input is required to get necessary results. The most uncomplicated solution for users is online service Dronedeploy (https://www.dronedeploy.com/). All other software packages and services will fit somewhere between these two in terms of complexity and power.

Step seven (optional): import created map to UgCS

Should the need arise for the mission to be repeated in the future, UgCS enables importing the GeoTiff file as a map layer and using it for mission planning. More detailed instructions can be found in the UgCS User Manual. Figure 22 shows a map created using the UgCS photogrammetry tool and imported as a GeoTiff file.

Figure 22: Imported GeoTiff map as layer. The map is the output of a Photogrammetry survey mission planned with UgCS


Reactions from experts: Robotics and tech to receive funding boost from UK government

Yesterday, the UK government announced their budget plans to invest in robotics, artificial intelligence, driverless cars, and faster broadband. The spending commitments include:

  • £16m to create a 5G hub to trial the forthcoming mobile data technology. In particular, the government wants there to be better mobile network coverage over the country’s roads and railway lines
  • £200m to support local “full-fibre” broadband network projects that are designed to bring in further private sector investment
  • £270m towards disruptive technologies to put the UK “at the forefront” including cutting-edge artificial intelligence and robotics systems that will operate in extreme and hazardous environments, including off-shore energy, nuclear energy, space and deep mining; batteries for the next generation of electric vehicles; and biotech.
  • Investing £300 million to further develop the UK’s research talent, including through creating an additional 1,000 PhD places.

The entire budget can be reviewed here.

Several experts in the robotics community agree that progress is shifting in the right direction; however, more needs to happen if the UK is to remain competitive in the robotics sector:

Prof Paul Newman, Founder, Oxbotica:

“The UK understand the very real positive impact that RAS [robotics & autonomous systems] will have on our society from now, of all time. It continues to see the big picture and today’s announcement by the Chancellor is a clear indication of that. We can have better roads, cleaner cities, healthier oceans and bodies, safer skies, deeper mines, better jobs and more opportunity. That’s what machines are for.”

Dr Graeme Smith, CEO, Oxbotica:

“We are at a real inflection point in the development of autonomous technology. The UK has a number of nascent world class companies in the area of self-driving vehicles, which have a huge potential to change the world, whilst creating jobs and producing exportable UK goods and services. We have a head start and now we need to take advantage of it.” [from FT]

Dominic Keen, Founder of Britbots:

“Some of the great robotics companies of the future are being launched by British entrepreneurs and the support announced in today’s budget will to strengthen their impact and global competitiveness.  We’re currently seeing strong appetite from private investors to back locally-grown robotics businesses and this money will help bring even more interest in this space”

Dr Rob Buckingham, Director of the UK Atomic Energy Authority’s RACE robotics centre:

“This is welcome news for the many research organisations developing robotics applications. As a leading UK robotics research group specialising in extreme and challenging environments, we welcome the allocation of significant funding in this field as part of the Government’s evolving Industrial Strategy. RACE and the rest of the robotics R&D sector are looking forward to working with industry to fully utilise this funding.”

Dr Sabine Hauert, University of Bristol:

“Robotics and AI is set to be a driving force in increasing productivity, but also in solving societal and environmental challenges. It’s opening new frontiers in off-shore and nuclear energy, space and deep mining. Investment from government will be key in helping the UK stay at the forefront of this field.” [from BBC]

Prof Noel Sharkey, University of Sheffield:

“We lost our best machine learning group to Amazon just recently. The money means there will be more resources for universities, which may help them retain their staff. But it’s not nearly enough for all of the disruptive technologies being developed in the UK. The government says it want this to be the leading robotics country in the world, but Google and others are spending far more, so it’s ultimately chicken feed by comparison.” [from BBC]

Prof Alan Winfield, UWE Bristol:

“I’m pleased by the additional funding, and, in fact, my group is a partner in a new £4.6M EPSRC grant to develop robots for nuclear decommissioning announced last week.

But having just returned from Tokyo (from AI in Asia: AI for Social Good), I’m well aware that other countries are investing much more heavily than the UK. China was for instance described as an emerging powerhouse of AI. A number of colleagues at that meeting also made the same point as Noel, that universities are haemorrhaging star AI/robotics academics to multi-national companies with very deep pockets.”

Michael Szollosy, Research Fellow at Sheffield Centre for Robotics, Dept of Psychology:

“I, like many others, was pleased to hear more money going into robotics and AI research, but I was disappointed – though completely unsurprised – to see nothing about how to restructure the economy to deal with the consequences of increasing research into and use of robots and AI. Hammond’s blunder on the relationship of productivity to wages – and it can’t be seen as anything other than a blunder – means that he doesn’t even seem to appreciate that there is a problem.

The truth is that increased automation means fewer jobs and lower wages and this needs to be addressed with some concrete measures. There will be benefits to society with increased automation, but we need to start thinking now (and taking action now) to ensure that those benefits aren’t solely economic gain for the already-wealthy. The ‘robot dividend’ needs to be shared across society, as it can have far-reaching consequences beyond economics: improving our quality of life, our standard of living, education, health and accessibility.”

Frank Tobe, Editor at The Robot Report:

“America has the American Manufacturing Initiative which, in 2015, was expanded to establish Fraunhofer-like research facilities around the US (on university campuses) that focus on particular aspects of the science of manufacturing.

Robotics were given $50 million of the $500 million for the initiative and one of the research facilities was to focus on robotics. Under the initiative, efforts from the SBIR, NSF, NASA and DoD/DARPA were to be coordinated in their disbursement of fundings for science in robotics. None of these fundings comes anywhere close to the coordinated funding programs and P-P-Ps found in the EU, Korea and Japan, nor the top-down incentivized directives of China’s 5-year plans. Essentially American robotic funding is (and has been) predominantly entrepreneurial with token support from the government.

In the new Trump Administration, there is no indication of any direction nor continuation (funding) of what little existing programs we have. At a NY Times editorial board sit-down with Trump after his election, he was quoted as saying that “Robotics is becoming very big and we’re going to do that. We’re going to have more factories. We can’t lose 70,000 factories. Just can’t do it. We’re going to start making things.” Thus far there is no followup to those statements nor has Trump hired replacements for the top executives at the Office of Science and Technology Policy, all of which are presently vacant.”


Should an artificial intelligence be allowed to get a patent?

Whether an A.I. ought to be granted patent rights is a timely question given the increasing proliferation of A.I. in the workplace. For example, Daimler-Benz has tested self-driving trucks on public roads,[1] A.I. technology has been applied effectively in medical advancements, psycholinguistics, tourism and food preparation,[2] a film written by an A.I. recently debuted online,[3] and A.I. has even found its way into the legal profession.[4] There is also current interest in the question of whether an A.I. can enjoy copyright rights, with several articles having already been published on the subject.[5]

In 2014 the U.S. Copyright Office updated its Compendium of U.S. Copyright Office Practices with, inter alia, a declaration that “the Office will not register works produced by a machine or mere mechanical process that operates randomly or automatically without any creative input or intervention from a human author.”[6]

To grant or not to grant: A human prerequisite?

One might argue that Intellectual Property (IP) laws and IP Rights were designed to exclusively benefit human creators and inventors[7] and thus would exclude non-humans from holding IP rights. The U.S. Copyright Office’s December 2014 update to the Compendium of U.S. Copyright Office Practices that added requirements for human authorship[8] certainly adds weight to this view.

However, many IP laws were drafted well before the emergence of A.I. and, in any case, do not explicitly require that a creator or inventor be ‘human.’ The World Intellectual Property Organization’s (WIPO’s) definition of Intellectual Property talks about creations of the mind[9] but does not specify whether it must be a human mind. Similarly, provisions in laws promoting innovation and IP rights, such as the so-called Intellectual Property Clause of the U.S. Constitution,[10] do not explicitly mention a ‘human’ requirement. Finally, it ought to be noted that while the U.S. Copyright Office declared it would not register works produced by a machine or mere mechanical process without human creative input, it did not explicitly state that an A.I. could not have copyright rights.[11]

Legal personhood

One might also argue that an A.I. is not human, is therefore not a legal person and, thus, is not entitled to apply for, much less be granted, a patent. New Zealand’s Patents Act, for example, refers to a patent ‘applicant’ as a ‘person’.[12]

Yet this line of argument could be countered by the assertion that a legal ‘person’ need not be ‘human’, as is the case with a corporation, and there are many examples of patents assigned to corporations.[13]

The underlying science

To answer the question of patent rights for an A.I. we need to examine how modern A.I. systems work and, as an example, consider how machine translation applications such as Google Translate function.

While such systems are marketed as if they’re “magic brains that just understand language”,[14] the problem is that there is currently no definitive scientific description of language[15] or language processing[16]. Thus, such language translation systems cannot function by mimicking the processes of the brain.

Rather, they employ a scheme known as Statistical Machine Translation (SMT), whereby online systems search the Internet to identify documents that have already been translated by human translators, for example books, documents from organizations like the United Nations, or websites. The system scans these texts for statistically significant patterns, and once the computer finds a pattern it uses that pattern to translate similar text in the future.[17] This, as Jaron Lanier and others note, means that the people who created the translations and make translation systems possible are not paid for their contributions.[18]
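The idea of learning translations from patterns in already-translated text can be illustrated with a deliberately tiny Python sketch. It is not how Google Translate or any production SMT system works (real systems use alignment models, phrase tables and language models); it only counts which target word co-occurs most often with each source word. The corpus and function names are invented for the illustration.

```python
from collections import Counter, defaultdict


def learn_word_table(parallel_sentences):
    """For each source word, pick the target word it co-occurs with most often."""
    cooccurrence = defaultdict(Counter)
    for source, target in parallel_sentences:
        for source_word in source.split():
            cooccurrence[source_word].update(target.split())
    return {word: counts.most_common(1)[0][0] for word, counts in cooccurrence.items()}


def translate(sentence, table):
    """Word-for-word lookup; unknown words are passed through unchanged."""
    return " ".join(table.get(word, word) for word in sentence.split())


# Hypothetical human-translated sentence pairs (English, German).
corpus = [
    ("the house", "das haus"),
    ("the book", "das buch"),
    ("a house", "ein haus"),
    ("a book", "ein buch"),
]

table = learn_word_table(corpus)
print(translate("a house", table))
print(translate("the old book", table))
```

The second call shows the limitation discussed later in this article: a word the system has never seen in human translations ("old") cannot be translated at all, because the model has only statistics and no knowledge of the language.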

Many modern A.I. systems are largely big data models that operate by defining a real world problem that needs to be solved, then conceiving a conceptual model to solve this problem, which is typically a statistical analysis that falls into one of three categories: regression, classification or missing data. Data is then fed into the model and used to refine and calibrate it. As the model is increasingly refined it is used to guide the location of data and, after a number of rounds of refinement, finally results in a model capable of some predictive functionality.[19]

Big data models can be used to discover patterns in large data sets[20] but also can, as in the case of translation systems, exploit statistically significant correlations in data.

None of this, however, suggests that current A.I. systems are capable of inventive or creative capacity.

Patentability?

To get a patent, an invention must:

  • Be novel in that it does not form part of the prior art[21]
  • Have an inventive step in that it is not obvious to a person skilled in the art[22]
  • Be useful[23]
  • Not fall into an excluded category, which can include discoveries, presentations of information and mental processes or rules or methods for performing a mental act.[24]

Why discoveries are not inventions is tied to the issue of obviousness, as noted by Buckley J. in Reynolds v. Herbert Smith & Co., Ltd,[25] who stated:

“Discovery adds to the amount of human knowledge, but it does so only by lifting the veil and disclosing something which before had been unseen or dimly seen. Invention also adds to human knowledge, but not merely by disclosing something. Invention necessarily involves also the suggestion of an act to be done, and it must be an act which results in a new product, or a new result, or a new process, or a new combination for producing an old product or an old result.”

Therefore in order to get a patent, an A.I. must first be capable of producing a patentable invention but, given current technology, is this even possible?

A thought exercise

Consider the following:

You believe that as a person exercises more, he/she consumes more oxygen and have tasked your A.I. with analyzing the relationship between oxygen consumption and exercise.

You provide the A.I. with a model suggesting that oxygen consumption increases with physical exertion and data that shows oxygen consumption among people performing little, moderate (e.g. walking briskly) and heavy exercise (e.g. running).

The A.I. reviews the data, refines the model, collects more data and comes up with a predictive model (e.g. when a person exercises X amount, he/she consumes Y amount of oxygen and when the person doubles his/her exertion, his/her oxygen consumption rate triples).

As this is essentially a statistical regression, the model will not always be completely accurate in its predictions due to differences between individuals (i.e. for some persons the model will predict oxygen consumption fairly accurately, for others its results will be far off).

However, this particular model has another, more fundamental limitation – it fails to consider that a human cannot exercise beyond a certain point because his/her heart would be incapable of sustaining such levels of exertion[26] or because over-exercise may trigger an unexpected reaction (e.g. death).[27]

If one were to feed this model data on persons who have collapsed or died during exercise (and thus, in the latter case, no longer consume any oxygen), would the A.I. be able to ‘think outside its box’ and:

  • Question the cause of these data discrepancies and have the initiative to conduct further investigation?
  • Note and correct the limitation in the original model (which would require a significant amendment)?

Or would it simply alter the existing model by changing the slope of the regression line?
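The last possibility can be illustrated with a short Python sketch of an ordinary least-squares line fit. The exertion and oxygen figures are invented purely for illustration, and a straight line is used as a stand-in for whatever regression model the A.I. in the thought exercise might employ.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: exertion level versus oxygen consumption (arbitrary units).
exertion = [1, 2, 3, 4, 5]
oxygen = [10, 20, 30, 40, 50]
slope_before, _ = fit_line(exertion, oxygen)

# Add two hypothetical subjects who collapsed at extreme exertion and consume no oxygen.
exertion_all = exertion + [6, 7]
oxygen_all = oxygen + [0, 0]
slope_after, _ = fit_line(exertion_all, oxygen_all)

print("slope before:", slope_before)
print("slope after: ", slope_after)
```

The fitting procedure absorbs the new points by tilting the line. Nothing in it asks why the new points are different or proposes a model with a physiological ceiling; that step would have to come from outside the regression.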

SMT and other A.I. systems have similar limitations. In the case of SMT, once the system is built, linguistic knowledge becomes necessary to achieve perfect translation at all grammatical levels.[28] SMT systems presently cannot translate cultural components of the source text into the target language, provide very literal, word-for-word translations that do not recognize idioms, slang and terms that are not in the machine’s memory, and lack human creativity.[29] Overcoming these limitations would require a change to the underlying machine translation model, and the question arises whether this would have to be done by the human creators of the SMT or whether the SMT itself would be able to make the necessary corrections and adjustments to the model.

Should the SMT or, in the earlier example, the A.I. be unable to improve and, in this case, innovate on the existing model, does it have the creative or inventive capacity to conceive an invention that is truly inventive? And if either the SMT or the A.I. can produce something that appears novel and inventive, given the nature of how A.I. presently operates (i.e. as big data models), would such a product be the result of an analysis of existing data to uncover hitherto unseen relationships – in other words, a discovery?

Returning to the original question about patent rights for an A.I., perhaps the question we should ask is not whether an A.I. should be able to get a patent, but whether an A.I., given current technology, can create a patentable invention in the first place and if the answer to that question is ‘no’, then the question of granting patent rights to an A.I. is moot.

References:

[1] Jon Fingas, ‘Daimler tests a self-driving, mass-produced truck on real roads’, Engadget, Oct. 4, 2016.

[2] Sophie Curtis, ‘Cognitive Cooking: How Is A.I. Changing Foodtech?’, RE•WORK, April 19, 2016.

[3] Annalee Newitz, ‘Movie written by algorithm turns out to be hilarious and intense’, Ars Technica, June 9, 2016.

[4] Susan Beck, ‘AI Pioneer ROSS Intelligence Lands Its First Big Law Clients’, The American Lawyer, May 6, 2016.

[5] See for example:

[6] Copyright Office, Compendium of U.S. Copyright Office Practices (3d ed. 2014). § 313.2

[7] Hettinger argued that ‘the most powerful intuition supporting property rights is that people are entitled to the fruits of their labor’

  • See: Edwin Hettinger, ‘Justifying Intellectual Property’, Philosophy & Public Affairs, Vol. 18, No. 1, Winter 1989, pp. 31-52

[8] Copyright Office, Compendium of U.S. Copyright Office Practices (3d ed. 2014). § 306.

[9] According to WIPO, ‘Intellectual property (IP) refers to creations of the mind, such as inventions; literary and artistic works; designs; and symbols, names and images used in commerce.’

[10] Article I, Section 8, Clause 8 of the United States Constitution states: The Congress shall have Power To…. promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.

[11] According to the U.S. Copyright Office, ‘Copyright exists from the moment the work is created. You will have to register, however, if you wish to bring a lawsuit for infringement of a U.S. work’

See ‘Copyright in General’,

[12] s.5(1) Patents Act 2013

[13] See for example, U.S. Patent 5,953,441 ‘Fingerprint sensor having spoof reduction features and related methods’ which was assigned to Harris Corporation.

[14] Nilagia McCoy, ‘Jaron Lanier: The Digital Economy Since Who Owns The Future?’, October 8, 2015.

[15] This was noted by Jaron Lanier who delivered the keynote address at the opening of the Conference on the Global Digital Content Market taking place from April 20-22, 2016 at WIPO Headquarters in Geneva, Switzerland.

[16] See for example:

[17] Inside Google Translate, July 9, 2010.

[18] Catherine Jewell, ‘Digital pioneer, Jaron Lanier, on the dangers of “free” online culture,’ WIPO Magazine, April 2016.

[19] Noah Silverman.

[20] i.e. ‘data mining’.

[21] s 6, New Zealand Patents Act 2013

[22] s 7, New Zealand Patents Act 2013

[23] s 10, New Zealand Patents Act 2013

[24] See for example: s. 93(2) Hong Kong Patents Ordinance, Cap 514 or the Canadian Patent Act R.S.C., 1985, c. P-4

[25] (1903), 20 R.P.C. 123

[26] Wikipedia, ‘Heart Rate’.

[27] This may be caused by a congenital condition but some also believe that prolonged excessive exercise may exacerbate matters. See:

[28] Mireia Farrus, Marta R. Costa-Jussa, Jose B. Marino, Marc Poch, Adolfo Hernandez, Carlos Henrıque,  Jose A. R. Fonollosa, ‘Overcoming statistical machine translation limitations: error analysis and proposed solutions for the Catalan–Spanish language pair’ Language Resources & Evaluation, DOI 10.1007/s10579-011-9137-0.

[29] ‘Machine Translation vs. Human Translators’ 


Supporting Women in Robotics on International Women’s Day.. and beyond.

International Women’s Day is raising discussion about the lack of diversity and role models in STEM and the potential negative outcomes of bias and stereotyping in robotics and AI. Let’s balance the words with positive actions. Here’s what we can all do to support women in robotics and AI, and thus improve diversity, innovation and reduce skills shortages for robotics and AI.

Join WomeninRobotics.org – a network of women working in robotics (or who aspire to work in robotics). We are a global discussion group supporting local events that bring women together for peer networking. We recognize that lack of support and mentorship in the workplace holds women back, particularly if there is only one woman in an organization/company.

Although the main group is only for women, we are going to start something for male ‘Allies’ or ‘Champions’. So men, you can join women in robotics too! Women need champions and while it would be ideal to have an equal number of women in leadership roles, until then, companies can improve their hiring and retention by having visible and vocal male allies. We all need mentors as our careers progress.

Women also need visibility and high profile projects for their careers to progress on par. One way of improving that is to showcase the achievements of women in robotics. Read and share all four years’ worth of our annual “25 Women in Robotics you need to know about” – that’s more than 100 women already because we have some groups in there. (There have always been a lot of women on the core team at Robohub.org, so we love showing our support.) Our next edition will come out on October 10, 2017 to celebrate Ada Lovelace Day.

Change starts at the top of an organization. It’s very hard to hire women if you don’t have any women, or if they can’t see pathways for advancement in your organization. However, there are many things you can do to improve your hiring practices. Some are surprisingly simple, yet effective. I’ve collected a list and posted it at Silicon Valley Robotics – How to hire women.

And you can invest in women entrepreneurs. All the studies show that you get a higher rate of return, and a higher likelihood of success, from investments in female founders. And yet, proportionately, investment is much less. You don’t need to be a VC to invest in women either. Kiva.org is matching loans today and $25 can empower an entrepreneur anywhere in the world. #InvestInHer

And our next Silicon Valley/ San Francisco Women in Robotics event will be on March 22 at SoftBank Robotics – we’d love to see you there – or in support!

Back to the core of intelligence … to really move to the future

Guest post by José Hernández-Orallo, Professor at Technical University of Valencia

Two decades ago I started working on metrics of machine intelligence. By that time, during the glacial days of the second AI winter, few were really interested in measuring something that AI lacked completely. And very few, such as David L. Dowe and I, were interested in metrics of intelligence linked to algorithmic information theory, where the models of interaction between an agent and the world were sequences of bits, and intelligence was formulated using Solomonoff’s and Wallace’s theories of inductive inference.

In the meantime, seemingly dozens of variants of the Turing test were proposed every year, the CAPTCHAs were introduced and David showed how easy it is to solve some IQ tests using a very simple program based on a big-switch approach. And, today, a new AI spring has arrived, triggered by a blossoming machine learning field, bringing a more experimental approach to AI with an increasing number of AI benchmarks and competitions (see a previous entry in this blog for a survey).

Considering this 20-year perspective, last year was special in many ways. The first in a series of workshops on evaluating general-purpose AI took off, echoing the increasing interest in the assessment of artificial general intelligence (AGI) systems, capable of finding diverse solutions for a range of tasks. Evaluating these systems is different, and more challenging, than the traditional task-oriented evaluation of specific systems, such as a robotic cleaner, a credit scoring model, a machine translator or a self-driving car. The idea of evaluating general-purpose AI systems using videogames had caught on. The arcade learning environment (the Atari 2600 games) or the more flexible Video Game Definition Language and associated competition became increasingly popular for the evaluation of AGI and its recent breakthroughs.

Last year also witnessed the introduction of a different kind of AI evaluation platform, such as Microsoft’s Malmö, GoodAI’s School, OpenAI’s Gym and Universe, DeepMind’s Lab, Facebook’s TorchCraft and CommAI-env. Based on a reinforcement learning (RL) setting, these platforms make it possible to create many different tasks and connect RL agents through a standard interface. Many of these platforms are well suited for the new paradigms in AI, such as deep reinforcement learning and some open-source machine learning libraries. After thousands of episodes or millions of steps against a new task, these systems are able to excel, usually with better-than-human performance.
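What “a standard interface” means in practice can be sketched in a few lines of Python. The example loosely follows the reset/step convention popularised by OpenAI Gym, but it does not use or reproduce the API of any of the platforms named above; the task, class and function names are hypothetical.

```python
import random


class GuessParityTask:
    """Toy task: reward 1 if the action equals the parity of the observed number, else 0."""

    def __init__(self, episode_length=5, seed=0):
        self.episode_length = episode_length
        self.rng = random.Random(seed)

    def reset(self):
        self.steps = 0
        self.current = self.rng.randint(0, 9)
        return self.current

    def step(self, action):
        reward = 1 if action == self.current % 2 else 0
        self.steps += 1
        done = self.steps >= self.episode_length
        self.current = self.rng.randint(0, 9)
        return self.current, reward, done


def run_episode(env, policy):
    """Generic agent-environment loop: works for any task exposing reset() and step()."""
    observation = env.reset()
    total_reward, done = 0, False
    while not done:
        observation, reward, done = env.step(policy(observation))
        total_reward += reward
    return total_reward


perfect_score = run_episode(GuessParityTask(), lambda number: number % 2)
wrong_score = run_episode(GuessParityTask(), lambda number: 1 - number % 2)
print(perfect_score, wrong_score)
```

Because `run_episode` only relies on `reset()` and `step()`, the same agent code can be pointed at a completely different task, which is what makes such platforms convenient for evaluating one agent across many tasks.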

Despite the myriads of applications and breakthroughs that have been derived from this paradigm, there seems to be a consensus in the field that the main open problem lies in how an AI agent can reuse the representations and skills from one task to new ones, making it possible to learn a new task much faster, with a few examples, as humans do. This can be seen as a mapping problem (usually under the term transfer learning) or can be seen as a sequential problem (usually under the terms gradual, cumulative, incremental, continual or curriculum learning).

One of the key notions associated with a system’s capability to build new concepts and skills on top of previous ones is usually referred to as “compositionality”, which is well documented in humans from early childhood. Systems are able to combine the representations, concepts or skills that have been learned previously in order to solve a new problem. For instance, an agent can combine the ability of climbing up a ladder with its use as a possible way out of a room, or an agent can learn multiplication after learning addition.

In my opinion, two of the previous platforms are better suited for compositionality: Malmö and CommAI-env. Malmö has all the ingredients of a 3D game, and AI researchers can experiment and evaluate agents with vision and 3D navigation, which is what many research papers using Malmö have done so far, as this is a hot topic in AI at the moment. However, to me, the most interesting feature of Malmö is building and crafting, where agents must necessarily combine previous concepts and skills in order to create more complex things.

CommAI-env is clearly an outlier in this set of platforms. It is not a video game in 2D or 3D. Video or audio don’t have any role there. Interaction is just produced through a stream of input/output bits and rewards, which are just +1, 0 or -1. Basically, actions and observations are binary. The rationale behind CommAI-env is to give prominence to communication skills, but it still allows for rich interaction, patterns and tasks, while “keeping all further complexities to a minimum”.

Examples of interaction within the CommAI-mini environment.
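The bit-level interface described above can be illustrated with a minimal sketch. This is not the CommAI-env API: the class `EchoBitEnv`, its methods and the echo task are hypothetical names invented here, and the toy task only ever hands out +1 or -1 (the reward 0 allowed by the platform is not used).

```python
import random


class EchoBitEnv:
    """Hypothetical toy task: the agent is rewarded for repeating the bit it just observed."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self._current = self._rng.randint(0, 1)

    def observe(self):
        # The observation is a single bit.
        return self._current

    def step(self, action_bit):
        # The action is a single bit; the reward is +1 for a correct echo, -1 otherwise.
        reward = 1 if action_bit == self._current else -1
        self._current = self._rng.randint(0, 1)
        return reward


def run(env, policy, steps):
    """Let a policy (a function from observed bit to action bit) interact for a number of steps."""
    total = 0
    for _ in range(steps):
        total += env.step(policy(env.observe()))
    return total


def copy_policy(bit):
    return bit


def flip_policy(bit):
    return 1 - bit


print(run(EchoBitEnv(), copy_policy, 100))  # 100
print(run(EchoBitEnv(), flip_policy, 100))  # -100
```

Everything the agent knows about the task has to be inferred from the stream of bits and rewards, which is what keeps further complexities to a minimum.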

When I became aware that the General AI Challenge was using CommAI-env for their warm-up round I was ecstatic. Participants could focus on RL agents without the complexities of vision and navigation. Of course, vision and navigation are very important for AI applications, but they create many extra complications if we want to understand (and evaluate) gradual learning. For instance, two equal tasks for which the texture of the walls changes can be seen as requiring higher transfer effort than two slightly different tasks with the same texture. In other words, these would be extra confounding factors that would make the analysis of task transfer and task dependencies much harder. It is then a wise choice to exclude them from the warm-up round. There will be occasions during other rounds of the challenge for including vision, navigation and other sorts of complex embodiment. Evaluating, through a minimal interface, whether agents are able to learn incrementally is not only a challenging but also an important open problem for general AI.

Also, the warm-up round has modified CommAI-env in such a way that bits are packed into 8-bit (1 byte) characters. This makes the definition of tasks more intuitive and makes the ASCII coding transparent to the agents. Basically, the set of actions and the set of observations are each extended to 256 symbols. But interestingly, the set of observations and the set of actions are the same, which allows many possibilities that are unusual in reinforcement learning, where these sets are different. For instance, an agent with primitives such as “copy input to output” and other sequence transformation operators can compose them in order to solve the task. Variables, and other kinds of abstractions, play a key role.
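A small sketch may help to picture both ideas: packing a bit stream into bytes and composing sequence primitives. The bit order (most significant bit first) and all function names below are assumptions made for illustration, not part of the challenge code.

```python
def bits_to_bytes(bits):
    """Pack a flat list of 0/1 values into bytes, assuming most-significant-bit-first order."""
    if len(bits) % 8 != 0:
        raise ValueError("bit stream length must be a multiple of 8")
    out = bytearray()
    for i in range(0, len(bits), 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


# Hypothetical primitives: each maps a byte sequence to a byte sequence,
# so inputs and outputs live in the same space and primitives can be chained.
def copy_input(seq):
    return seq


def reverse(seq):
    return seq[::-1]


def upper(seq):
    return seq.upper()


def compose(*primitives):
    """Build a new skill by applying the given primitives from left to right."""
    def composed(seq):
        for primitive in primitives:
            seq = primitive(seq)
        return seq
    return composed


# 'h' is 01101000 and 'i' is 01101001 in ASCII.
bits = [0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1]
message = bits_to_bytes(bits)
print(message)                             # b'hi'
print(compose(reverse, upper)(message))    # b'IH'
```

Because observations and actions share one alphabet, the output of any primitive is a valid input for any other, which is what makes this kind of composition possible.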

This might give the impression that we are back to Turing machines and symbolic AI. In a way, this is the case, and very much in line with Turing’s vision in his 1950 paper: “it is possible to teach a machine by punishments and rewards to obey orders given in some language, e.g., a symbolic language”. But in 2017 we have a range of techniques that weren’t available just a few years ago. For instance, Neural Turing Machines and other neural networks with symbolic memory can be very well suited for this problem.

By no means does this indicate that the legion of deep reinforcement learning enthusiasts cannot bring their apparatus to this warm-up round. Indeed they won’t be disappointed by this challenge if they really work hard to adapt deep learning to this problem. They probably won’t need a convolutional network tuned for visual pattern recognition, but there are many possibilities and challenges in how to make deep learning work in a setting like this, especially because the fewer examples, the better, and deep learning usually requires many examples.

As a plus, the simple, symbolic sequential interface opens the challenge to many other areas in AI, not only recurrent neural networks but techniques from natural language processing, evolutionary computation, compression-inspired algorithms or even areas such as inductive programming, with powerful string-handling primitives and its appropriateness for problems with very few examples.

I think that all of the above makes this warm-up round a unique competition. Of course, since we haven’t had anything similar in the past, we might have some surprises. It might happen that an unexpected (or even naïve) technique could behave much better than others (and humans) or perhaps we find that no technique is able to do something meaningful at this time.

I’m eager to see how this round develops and what the participants are able to integrate and invent in order to solve the sequence of micro and mini-tasks. I’m sure that we will learn a lot from this. I hope that machines will, too. And all of us will move forward to the next round!

José Hernández-Orallo is a professor at the Technical University of Valencia and author of “The Measure of All Minds: Evaluating Natural and Artificial Intelligence”, Cambridge University Press, 2017.


Back to the core of intelligence … to really move to the future was originally published in AI Roadmap Institute Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Unsolved Problems in AI

Guest post by Simon Andersson, Senior Research Scientist @GoodAI

Executive summary

  • Tracking major unsolved problems in AI can keep us honest about what remains to be achieved and facilitate the creation of roadmaps towards general artificial intelligence.
  • This document currently identifies 29 open problems.
  • For each major problem, example tests are suggested for evaluating research progress.

Introduction

This document identifies open problems in AI. It seeks to provide a concise overview of the greatest challenges in the field and of the current state of the art, in line with the “open research questions” theme of focus of the AI Roadmap Institute.

The challenges are grouped into AI-complete problems, closed-domain problems, and fundamental problems in commonsense reasoning, learning, and sensorimotor ability.

I realize that this first attempt at surveying the open problems will necessarily be incomplete and welcome reader feedback.

To help accelerate the search for general artificial intelligence, GoodAI is organizing the General AI Challenge (GoodAI, 2017), which aims to solve some of the problems outlined below through a series of milestone challenges starting in early 2017.

Sources, method, and related work

The collection of problems presented here is the result of a review of the literature in the areas of

  • Machine learning
  • Machine perception and robotics
  • Open AI problems
  • Evaluation of AI systems
  • Tests for the achievement of human-level intelligence
  • Benchmarks and competitions

To be considered for inclusion, a problem must be

  1. Highly relevant for achieving general artificial intelligence
  2. Closed in scope, not subject to open-ended extension
  3. Testable

Problems vary in scope and often overlap. Some may be contained entirely in others. The second criterion (closed scope) excludes some interesting problems such as learning all human professions; a few problems of this type are mentioned separately from the main list. To ensure that problems are testable, each is presented together with example tests.

Several websites, some listed below, provide challenge problems for AI.

In the context of evaluating AI systems, Hernández-Orallo (2016a) reviews a number of open AI problems. Lake et al. (2016) offer a critique of the current state of the art in AI and discuss problems like intuitive physics, intuitive psychology, and learning from few examples.

A number of challenge problems for AI were proposed in (Brooks, et al., 1996) and (Brachman, 2006).

The challenges

The rest of the document lists AI challenges as outlined below.

  1. AI-complete problems
  2. Closed-domain problems
  3. Commonsense reasoning
  4. Learning
  5. Sensorimotor problems

AI-complete problems

AI-complete problems are ones likely to contain all or most of human-level general artificial intelligence. A few problems in this category are listed below.

  1. Open-domain dialog
  2. Text understanding
  3. Machine translation
  4. Human intelligence and aptitude tests
  5. Coreference resolution (Winograd schemas)
  6. Compound word understanding

Open-domain dialog

Open-domain dialog is the problem of competently conducting a dialog with a human when the subject of the discussion is not known in advance. The challenge includes language understanding, dialog pragmatics, and understanding the world. Versions of the task include spoken and written dialog. The task can be extended to include multimodal interaction (e.g., gestural input, multimedia output). Possible success criteria are usefulness and the ability to conduct dialog indistinguishable from human dialog (“Turing test”).

Tests

Dialog systems are typically evaluated by human judges. Events where this has been done include

  1. The Loebner prize (Loebner, 2016)
  2. The Robo chat challenge (Robo chat challenge, 2014)

Text understanding

Text understanding is an unsolved problem. There has been remarkable progress in the area of question answering, but current systems still fail when common-sense world knowledge, beyond that provided in the text, is required.

Tests

  1. McCarthy (1976) provided an early text understanding challenge problem.
  2. Brachman (2006) suggested the problem of reading a textbook and solving its exercises.

Machine translation

Machine translation is AI-complete since it includes problems requiring an understanding of the world (e.g., coreference resolution, discussed below).

Tests

While translation quality can be evaluated automatically using parallel corpora, the ultimate test is human judgement of quality. Corpora such as the Corpus of Contemporary American English (Davies, 2008) contain samples of text from different genres. Translation quality can be evaluated using samples of

  1. Newspaper text
  2. Fiction
  3. Spoken language transcriptions
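
As an illustration of what automatic evaluation against a parallel corpus can look like, here is a minimal sketch of clipped n-gram precision, the kind of overlap count that metrics such as BLEU build on. The choice of this particular measure, the whitespace tokenization and the example sentences are assumptions made here, not something prescribed by the text above.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n=1):
    """Fraction of candidate n-grams that also occur in the reference.

    Each n-gram is credited at most as many times as it appears in the
    reference, so repeating a correct word does not inflate the score.
    """
    cand = Counter(ngrams(candidate.split(), n))
    ref = Counter(ngrams(reference.split(), n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total


system_output = "the cat sat on the mat"
human_translation = "the cat is on the mat"
print(clipped_precision(system_output, human_translation, n=1))  # 5 of 6 unigrams match
print(clipped_precision(system_output, human_translation, n=2))  # 3 of 5 bigrams match
```

Scores of this kind are cheap to compute per genre, but they only measure surface overlap, which is why human judgement remains the ultimate test.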

Intelligence tests

Human intelligence and aptitude tests (Hernández-Orallo, 2017) are interesting in that they are designed to be at the limit of human ability and to be hard or impossible to solve using memorized knowledge. Human-level performance has been reported for Raven’s progressive matrices (Lovett and Forbus, 2017) but artificial systems still lack the general reasoning abilities to deal with a variety of problems at the same time (Hernández-Orallo, 2016b).

Tests

  1. Brachman (2006) suggested using the SAT as an AI challenge problem.

Coreference resolution

The overlapping problems of coreference resolution, pronoun disambiguation, and Winograd schemas require picking out the referents of pronouns or noun phrases.

Tests

  1. Davis (2011) lists 144 Winograd schemas.
  2. Commonsense Reasoning (2016b) lists pronoun disambiguation problems: 62 sample problems and 60 problems used in the first Winograd Schema Challenge, held at IJCAI-16.
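
To make the structure of a Winograd schema concrete, the sketch below encodes the classic "city councilmen" example as a small data structure: one sentence template, two candidate referents, and a special word whose choice flips the correct answer. The class and function names are hypothetical, and the rule that a schema only counts as solved when both variants are resolved correctly is a scoring choice made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WinogradSchema:
    template: str      # sentence with a {word} slot for the special word
    candidates: tuple  # the two possible referents of the pronoun
    answers: dict      # special word -> correct referent


SCHEMA = WinogradSchema(
    template=("The city councilmen refused the demonstrators a permit "
              "because they {word} violence."),
    candidates=("the city councilmen", "the demonstrators"),
    answers={"feared": "the city councilmen", "advocated": "the demonstrators"},
)


def solved(schema, resolver):
    """True only if the resolver picks the right referent for every variant."""
    return all(
        resolver(schema.template.format(word=word), schema.candidates) == referent
        for word, referent in schema.answers.items()
    )


def first_candidate(sentence, candidates):
    """Baseline that ignores the sentence and always picks the first candidate."""
    return candidates[0]


print(solved(SCHEMA, first_candidate))  # False
```

A resolver that ignores meaning gets one variant right by chance and the other wrong, which is why the paired construction is hard to game with surface statistics.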

Compound word understanding

In many languages, there are compound words with set meanings. Novel compound words can be produced, and we are good at guessing their meaning. We understand that a water bird is a bird that lives near water, not a bird that contains or is constituted by water, and that schadenfreude is felt when others, not we, are hurt.

Tests

  1. “The meaning of noun phrases” at (Commonsense Reasoning, 2015)

Closed-domain problems

Closed-domain problems are ones that combine important elements of intelligence but reduce the difficulty by limiting themselves to a circumscribed knowledge domain. Game playing agents are examples of this and artificial agents have achieved superhuman performance at Go (Silver et al., 2016) and more recently poker (Aupperlee, 2017; Brown and Sandholm, 2017). Among the open problems are:

  1. Learning to play board, card, and tile games from descriptions
  2. Producing programs from descriptions
  3. Source code understanding

Board, card, and tile games from descriptions

Unlike specialized game players, systems that have to learn new games from descriptions of the rules cannot rely on predesigned algorithms for specific games.

Tests

  1. The problem of learning new games from formal-language descriptions has appeared as a challenge at the AAAI conference (Genesereth et al., 2005; AAAI, 2013).
  2. Even more challenging is the problem of learning games from natural language descriptions; such descriptions for card and tile games are available from a number of websites (e.g., McLeod, 2017).

Programs from descriptions

Producing programs in a programming language such as C from natural language input is a problem of obvious practical interest.

Tests

  1. The “Description2Code” challenge proposed at (OpenAI, 2016) has 5000 descriptions for programs collected by Ethan Caballero.

Source code understanding

Related to source code production is source code understanding, where the system can interpret the semantics of code and detect situations where the code differs in non-trivial ways from the likely intention of its author. Allamanis et al. (2016) report progress on the prediction of procedure names.

Tests

  1. The International Obfuscated C Code Contest (OCCC, 2016) publishes code that is intentionally hard to understand. Source code understanding could be tested as the ability to improve the readability of the code as scored by human judges.

Commonsense reasoning

Commonsense reasoning is likely to be a central element of general artificial intelligence. Some of the main problems in this area are listed below.

  1. Causal reasoning
  2. Counterfactual reasoning
  3. Intuitive physics
  4. Intuitive psychology

Causal reasoning

Causal reasoning requires recognizing and applying cause-effect relations.

Tests

  1. “Strength of evidence” at (Commonsense Reasoning, 2015)
  2. “Wolves and rabbits” at (Commonsense Reasoning, 2015)

Counterfactual reasoning

Counterfactual reasoning is required for answering hypothetical questions. It uses causal reasoning together with the system’s other modeling and reasoning capabilities to consider situations possibly different from anything that ever happened in the world.

Tests

  1. “The cruel and unusual Yale shooting problem” at (Commonsense Reasoning, 2015)

Intuitive physics

A basic understanding of the physical world, including object permanence and the ability to predict likely trajectories, helps agents learn faster and make better predictions. This is now a very active research area; some recent work is reported in (Agrawal et al., 2016; Chang et al., 2016; Degrave et al., 2016; Denil et al., 2016; Finn et al., 2016; Fragkiadaki et al., 2016; Hamrick et al., 2016; Li et al., 2016; Mottaghi et al., 2016; Nair et al., 2016; Stewart and Ermon, 2016).

Tests

  1. The “Physical reasoning” section at (Commonsense Reasoning, 2015) (8 problems)
  2. “The handle problem” at (Commonsense Reasoning, 2015)

Intuitive psychology

Intuitive psychology, or theory of mind, allows the agent to understand goals and beliefs and infer them from the behavior of other agents.

Tests

  1. The “Naive psychology” section at (Commonsense Reasoning, 2015) (4 problems)

Learning

Despite remarkable advances in machine learning, important learning-related problems remain mostly unsolved. They include:

  1. Gradual learning
  2. Unsupervised learning
  3. Strong generalization
  4. Category learning from few examples
  5. Learning to learn
  6. Compositional learning
  7. Learning without forgetting
  8. Transfer learning
  9. Knowing when you don’t know
  10. Learning through action

Gradual learning

Humans are capable of lifelong learning of increasingly complex tasks. Artificial agents should be, too. Versions of this idea have been discussed under the rubrics of life-long (Thrun and Mitchell, 1995), continual, and incremental learning. At GoodAI, we have adopted the term gradual learning (Rosa et al., 2016) for the long-term accumulation of knowledge and skills. It requires the combination of several abilities discussed below:

  • Compositional learning
  • Learning to learn
  • Learning without forgetting
  • Transfer learning

Tests

  1. A possible test applies to a household robot that learns household and house maintenance tasks, including obtaining tools and materials for the work. The test evaluates the agent on two criteria: Continuous operation (Nilsson in Brooks, et al., 1996) where the agent needs to function autonomously without reprogramming during its lifetime, and improving capability, where the agent must exhibit, at different points in its evolution, capabilities not present at an earlier time.

Unsupervised learning

Unsupervised learning has been described as the next big challenge in machine learning (LeCun, 2016). It appears to be fundamental to human lifelong learning (supervised and reinforcement signals do not provide nearly enough data) and is closely related to prediction and common-sense reasoning (“filling in the missing parts”). A hard problem (Yoshua Bengio, in the “Brains and bits” panel at NIPS 2016) is unsupervised learning in hierarchical systems, with components learning jointly.

Tests

In addition to the possible tests in the vision domain, speech recognition also presents opportunities for unsupervised learning. While current state-of-the-art speech recognizers rely largely on supervised learning on large corpora, unsupervised recognition requires discovering, without supervision, phonemes, word segmentation, and vocabulary. Progress has been reported in this direction, so far limited to small-vocabulary recognition (Riccardi and Hakkani-Tur, 2003; Park and Glass, 2008; Kamper et al., 2016).

  1. A full-scale test of unsupervised speech recognition could be to train on the audio part of a transcribed speech corpus (e.g., TIMIT (Garofolo, 1993)), then learn to predict the transcriptions with only very sparse supervision.

Strong generalization

Humans can transfer knowledge and skills across situations that share high-level structure but are otherwise radically different, adapting to the particulars of a new setting while preserving the essence of the skill, a capacity that (Tarlow, 2016; Gaunt et al., 2016) refer to as strong generalization. If we learn to clean up a room, we know how to clean up most other rooms.

Tests

  1. A general assembly robot could learn to build a toy castle in one material (e.g., Lego blocks) and be tested on building it from other materials (sand, stones, sticks).
  2. A household robot could be trained on cleaning and cooking tasks in one environment and be tested in highly dissimilar environments.

Category learning from few examples

Lake et al. (2015) achieved human-level recognition and generation of characters using few examples. However, learning more complex categories from few examples remains an open problem.

Tests

  1. The ImageNet database (Deng et al., 2009) contains images organized by the semantic hierarchy of WordNet (Miller, 1995). Correctly determining ImageNet categories from images with very little training data could be a challenging test of learning from few examples.

Learning to learn

Learning to learn or meta-learning (e.g., Harlow, 1949; Schmidhuber, 1987; Thrun and Pratt, 1998; Andrychowicz et al., 2016; Chen et al., 2016; de Freitas, 2016; Duan et al., 2016; Lake et al., 2016; Wang et al., 2016) is the acquisition of skills and inductive biases that facilitate future learning. The scenarios considered in particular are ones where a more general and slower learning process produces a faster, more specialized one. An example is biological evolution producing efficient learners such as human beings.

Tests

  1. Learning to play Atari video games is an area that has seen some remarkable recent successes, including in transfer learning (Parisotto et al., 2016). However, there is so far no system that first learns to play video games, then is capable of learning a new game, as humans can, from a few minutes of play (Lake et al., 2016).

Compositional learning

Compositional learning (de Freitas, 2016; Lake et al., 2016) is the ability to recombine primitive representations to accelerate the acquisition of new knowledge. It is closely related to learning to learn.

Tests

Tests for compositional learning need to verify both that the learner is effective and that it uses compositional representations.

  1. Some ImageNet categories correspond to object classes defined largely by their arrangements of component parts, e.g., chairs and stools, or unicycles, bicycles, and tricycles. A test could evaluate the agent’s ability to learn categories with few examples and to report the parts of the object in an image.
  2. Compositional learning should be extremely helpful in learning video games (Lake et al., 2016). A learner could be tested on a game already mastered, but where component elements have changed appearance (e.g., different-looking fish in the Frostbite game). It should be able to play the variant game with little or no additional learning.

Learning without forgetting

In order to learn continually over its lifetime, an agent must be able to generalize over new observations while retaining previously acquired knowledge. Recent progress towards this goal is reported in (Kirkpatrick et al., 2016) and (Li and Hoiem, 2016). Work on memory augmented neural networks (e.g., Graves et al., 2016) is also relevant.

Tests

A test for learning without forgetting needs to present learning tasks sequentially (earlier tasks are not repeated) and test for retention of early knowledge. It may also test for declining learning time for new tasks, to verify that the agent exploits the knowledge acquired so far.

  1. A challenging test for learning without forgetting would be to learn to recognize all the categories in ImageNet, presented sequentially.
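
The sequential protocol described above can be sketched as a small evaluation harness. The learner interface (`train` and `accuracy` methods), the function names and the toy learner are hypothetical and only serve to show the bookkeeping; learning time, which the text also mentions, is not measured here.

```python
def sequential_evaluation(learner, tasks):
    """Train on each task once, in order, re-testing all tasks seen so far after each one.

    Returns history, where history[i][j] is the accuracy on task j
    measured right after training on task i (for j <= i).
    """
    history = []
    for i, task in enumerate(tasks):
        learner.train(task)
        history.append([learner.accuracy(seen) for seen in tasks[:i + 1]])
    return history


def forgetting(history):
    """Per task (except the last): accuracy right after learning it minus accuracy at the end."""
    final = history[-1]
    return [history[j][j] - final[j] for j in range(len(history) - 1)]


class LastTaskOnlyLearner:
    """Toy learner that remembers only the most recent task (worst-case forgetting)."""

    def __init__(self):
        self.current = None

    def train(self, task):
        self.current = task

    def accuracy(self, task):
        return 1.0 if task == self.current else 0.0


history = sequential_evaluation(LastTaskOnlyLearner(), ["A", "B", "C"])
print(history)              # [[1.0], [0.0, 1.0], [0.0, 0.0, 1.0]]
print(forgetting(history))  # [1.0, 1.0]
```

A learner that retains early knowledge would show forgetting values close to zero for every task.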

Transfer learning

Transfer learning (Pan and Yang, 2010) is the ability of an agent trained in one domain to master another. Results in the area of text comprehension are currently poor unless the agent is given some training on the new domain (Kadlec, et al., 2016).

Tests

Sentiment classification (Blitzer et al., 2007) provides a possible testing ground for transfer learning. Learners can be trained on one corpus, tested on another, and compared to a baseline learner trained directly on the target domain.

  1. Reviews of movies and of businesses are two domains dissimilar enough to make knowledge transfer challenging. Corpora for the domains are Rotten Tomatoes movie reviews (Pang and Lee, 2005) and the Yelp Challenge dataset (Yelp, 2017).
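
The comparison against a baseline trained directly on the target domain can be sketched as follows. The word-voting classifier and the four-example "corpora" are toy stand-ins invented for this sketch; they are not the Rotten Tomatoes or Yelp data, and the function names are hypothetical.

```python
from collections import defaultdict


def train(examples):
    """Toy sentiment model: each word accumulates the labels (+1/-1) of the texts it appears in."""
    weights = defaultdict(int)
    for text, label in examples:
        for word in text.split():
            weights[word] += label
    return dict(weights)


def evaluate(model, examples):
    """Accuracy of sign-of-summed-weights predictions; ties and unknown words count as positive."""
    correct = 0
    for text, label in examples:
        score = sum(model.get(word, 0) for word in text.split())
        prediction = 1 if score >= 0 else -1
        correct += prediction == label
    return correct / len(examples)


def transfer_gap(source_train, target_train, target_test):
    """Compare a learner trained on the source domain with one trained on the target domain."""
    transferred = evaluate(train(source_train), target_test)
    baseline = evaluate(train(target_train), target_test)
    return {"transferred": transferred, "baseline": baseline, "gap": baseline - transferred}


# Made-up miniature domains, for illustration only.
movies = [("gripping plot", 1), ("dull plot", -1)]
business_train = [("friendly staff", 1), ("rude staff", -1)]
business_test = [("friendly service", 1), ("rude service", -1)]

print(transfer_gap(movies, business_train, business_test))
# {'transferred': 0.5, 'baseline': 1.0, 'gap': 0.5}
```

The smaller the gap, the more of the source-domain knowledge carried over to the target domain.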

Knowing when you don’t know

While uncertainty is modeled differently by different learning algorithms, it seems to be true in general that current artificial systems are not nearly as good as humans at “knowing when they don’t know.” An example is deep neural networks that achieve state-of-the-art accuracy on image recognition but assign 99.99% confidence to the presence of objects in images completely unrecognizable to humans (Nguyen et al., 2015).

Human performance on confidence estimation would include

  1. In induction tasks, like program induction or sequence completion, knowing when the provided examples are insufficient for induction (multiple reasonable hypotheses could account for them)
  2. In speech recognition, knowing when an utterance has not been interpreted reliably
  3. In visual tasks such as pedestrian detection, knowing when a part of the image has not been analyzed reliably
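
For the induction case, a minimal sketch of "knowing when the examples are insufficient" is to check whether all hypotheses consistent with the examples agree on the next element. The two-member hypothesis family below (arithmetic and integer geometric progressions) and the function names are assumptions chosen to keep the example tiny.

```python
def arithmetic(seq):
    """Next element if seq is an arithmetic progression, else None."""
    step = seq[1] - seq[0]
    if all(b - a == step for a, b in zip(seq, seq[1:])):
        return seq[-1] + step
    return None


def geometric(seq):
    """Next element if seq is a geometric progression with an integer ratio, else None."""
    if seq[0] == 0 or seq[1] % seq[0] != 0:
        return None
    ratio = seq[1] // seq[0]
    if all(b == a * ratio for a, b in zip(seq, seq[1:])):
        return seq[-1] * ratio
    return None


def complete(seq, hypotheses=(arithmetic, geometric)):
    """Return the next element, or None when the examples do not single out one answer.

    None covers both cases: several consistent hypotheses that disagree,
    and no consistent hypothesis at all. Expects at least two elements.
    """
    predictions = {h(seq) for h in hypotheses} - {None}
    return predictions.pop() if len(predictions) == 1 else None


print(complete([1, 2]))     # None: 3 (add 1) and 4 (double) are both reasonable
print(complete([1, 2, 4]))  # 8: only doubling fits
print(complete([2, 4, 6]))  # 8: only adding 2 fits
```

With a richer hypothesis space the same check applies, but enumerating the consistent hypotheses becomes the hard part.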

Tests

  1. A speech recognizer can be compared against a human baseline, measuring the ratio of the average confidence to the confidence on examples where recognition fails.
  2. The confidence of image recognition systems can be tested on generated adversarial examples.
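
The ratio in the first test can be computed as in the sketch below, assuming that each recognition result comes with a confidence value and a correctness flag and that "the confidence on examples where recognition fails" is taken as a mean. The function name and the toy numbers are illustrative only.

```python
from statistics import mean


def confidence_ratio(results):
    """Mean confidence over all examples divided by mean confidence on the failed ones.

    results: list of (confidence, is_correct) pairs.
    """
    failures = [confidence for confidence, ok in results if not ok]
    if not failures:
        raise ValueError("no failed examples to compare against")
    overall = [confidence for confidence, _ in results]
    return mean(overall) / mean(failures)


# A recognizer that is just as sure of itself when it is wrong.
overconfident = [(1.0, True), (1.0, False)]
# A recognizer that lowers its confidence when it is wrong.
self_aware = [(1.0, True), (0.5, False)]

print(confidence_ratio(overconfident))  # 1.0
print(confidence_ratio(self_aware))     # 1.5
```

Under this reading, a ratio close to 1 indicates a system that does not notice its own failures, and the human baseline gives the value to aim for.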

Learning through action

Human infants are known to learn about the world through experiments, observing the effects of their own actions (Smith and Gasser, 2005; Malik, 2015). This seems to apply both to higher-level cognition and perception. Animal experiments have confirmed that the ability to initiate movement is crucial to perceptual development (Held and Hein, 1963) and some recent progress has been made on using motion in learning visual perception (Agrawal et al., 2015). In (Agrawal et al., 2016), a robot learns to predict the effects of a poking action.

“Learning through action” thus encompasses several areas, including

  • Active learning, where the agent selects the training examples most likely to be instructive
  • Undertaking epistemological actions, i.e., activities aimed primarily at gathering information
  • Learning to perceive through action
  • Learning about causal relationships through action

Perhaps most importantly, for artificial systems, learning the causal structure of the world through experimentation is still an open problem.
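
As an illustration of the first item in the list above, active learning is often approximated by uncertainty sampling: the agent asks for the label of the example its current model is least sure about. This heuristic, the binary setting and the names below are choices made for the sketch, not something the surveyed work prescribes.

```python
import math


def entropy(p):
    """Binary entropy, in bits, of a predicted probability p for the positive class."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def most_instructive(pool, predict_proba):
    """Pick the unlabeled example whose predicted label is most uncertain."""
    return max(pool, key=lambda example: entropy(predict_proba(example)))


def sigmoid(x):
    """Stand-in for a trained model: probability that example x is positive."""
    return 1.0 / (1.0 + math.exp(-x))


pool = [-3.0, -0.2, 2.5]
print(most_instructive(pool, sigmoid))  # -0.2, the example closest to the decision boundary
```

Selecting examples this way only covers choosing among given data; gathering information by acting on the world, as in the other items, requires the agent to generate its own experiments.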

Tests

For learning through action, it is natural to consider problems of motor manipulation where in addition to the immediate effects of the agent’s actions, secondary effects must be considered as well.

  1. Learning to play billiards: An agent with little prior knowledge and no fixed training data is allowed to explore a real or virtual billiard table and should learn to play billiards well.

Sensorimotor problems

Outstanding problems in robotics and machine perception include:

  1. Autonomous navigation in dynamic environments
  2. Scene analysis
  3. Robust general object recognition and detection
  4. Robust, lifetime simultaneous localization and mapping (SLAM)
  5. Multimodal integration
  6. Adaptive dexterous manipulation

Autonomous navigation

Despite recent progress in self-driving cars by companies like Tesla, Waymo (formerly the Google self-driving car project) and many others, autonomous navigation in highly dynamic environments remains a largely unsolved problem, requiring knowledge of object semantics to reliably predict future scene states (Ess et al., 2010).

Tests

  1. Fully automatic driving in crowded city streets and residential areas is still a challenging test for autonomous navigation.

Scene analysis

The challenge of scene analysis extends far beyond object recognition and includes the understanding of surfaces formed by multiple objects, scene 3D structure, causal relations (Lake et al., 2016), and affordances. It is not limited to vision but can depend on audition, touch, and other modalities, e.g., electroreception and echolocation (Lewicki et al., 2014; Kondo et al., 2017). While progress has been made, e.g., in recognizing anomalous and improbable scenes (Choi et al., 2012), predicting object dynamics (Fouhey and Zitnick, 2014), and discovering object functionality (Yao et al., 2013), we are still far from human-level performance in this area.

Tests

Some possible challenges for understanding the causal structure in visual scenes are:

  1. Recognizing dangerous situations: A corpus of synthetic images could be created where the same objects are recombined to form “dangerous” and “safe” scenes as classified by humans.
  2. Recognizing physically improbable scenes: A synthetic corpus could be created to show physically plausible and implausible scenes containing the same objects.
  3. Recognizing useless objects: Images of useless objects have been created by (Kamprani, 2017).

Object recognition

While object recognition has seen great progress in recent years (e.g., Han et al., 2016), matches or surpasses human performance for many problems (Karpathy, 2014), and can approach perfection in closed environments (Song et al., 2015), state-of-the-art systems still struggle with the harder cases such as open objects (interleaved with background), broken objects, truncation and occlusion in dynamic environments (e.g., Rajaram et al., 2015).

Tests

Environments that are cluttered and contain objects drawn from a large, open-ended, and changing set of types are likely to be challenging for an object recognition system. An example would be

  1. Seeing photos of the insides of pantries and refrigerators and listing the ingredients available to the owners

Simultaneous localization and mapping

While the problem of simultaneous localization and mapping (SLAM) is considered solved for some applications, the challenge of SLAM for long-lived autonomous robots, in large-scale, time-varying environments, remains open (Cadena et al., 2016).

Tests

  1. Lifetime localization and mapping, without detailed maps provided in advance and robust to changes in the environment, for an autonomous car based in a large city

Multimodal integration

The integration of multiple senses (Lahat, 2015) is important, e.g., in human communication (Morency, 2015) and scene understanding (Lewicki et al., 2014; Kondo et al., 2017). Having multiple overlapping sensory systems seems to be essential for enabling human children to educate themselves by perceiving and acting in the world (Smith and Gasser, 2005).

Tests

Spoken communication in noisy environments, where lip reading and gestural cues are indispensable, can provide challenges for multimodal fusion. An example would be

  1. A robot bartender: The agent needs to interpret customer requests in a noisy bar.

Adaptive dexterous manipulation

Current robot manipulators do not come close to the versatility of the human hand (Ciocarlie, 2015). Hard problems include manipulating deformable objects and operating from a mobile platform.

Tests

  1. Taking out clothes from a washing machine and hanging them on clothes lines and coat hangers in varied places while staying out of the way of humans

Open-ended problems

Some noteworthy problems were omitted from the list because their scope is too open-ended: they encompass sets of tasks that evolve over time or can be endlessly extended. This makes it hard to decide whether a problem has been solved. Problems of this type include

  • Enrolling in a human university and taking classes like humans (Goertzel, 2012)
  • Automating all types of human work (Nilsson, 2005)
  • Puzzlehunt challenges, e.g., the annual TMOU game in the Czech Republic (TMOU, 2016)

Conclusion

I have reviewed a number of open problems in an attempt to delineate the current front lines of AI research. The problem list in this first version, as well as the problem descriptions, example tests, and mentions of ongoing work in the research areas, are necessarily incomplete. I plan to extend and improve the document incrementally and warmly welcome suggestions either in the comment section below or at the institute’s discourse forum.

Acknowledgements

I thank Jan Feyereisl, Martin Poliak, Petr Dluhoš, and the rest of the GoodAI team for valuable discussion and suggestions.

References

AAAI. “AAAI-13 International general game playing competition.” Online under http://www.aaai.org/Conferences/AAAI/2013/aaai13games.php (2013)

Agrawal, Pulkit, Joao Carreira, and Jitendra Malik. “Learning to see by moving.” Proceedings of the IEEE International Conference on Computer Vision. 2015.

Agrawal, Pulkit, et al. “Learning to poke by poking: Experiential learning of intuitive physics.” arXiv preprint arXiv:1606.07419 (2016).

AI•ON. “The AI•ON collection of open research problems.” Online under http://ai-on.org/projects (2016)

Allamanis, Miltiadis, Hao Peng, and Charles Sutton. “A convolutional attention network for extreme summarization of source code.” arXiv preprint arXiv:1602.03001 (2016).

Andrychowicz, Marcin, et al. “Learning to learn by gradient descent by gradient descent.” Advances in Neural Information Processing Systems. 2016.

Aupperlee, Aaron. “No bluff: Supercomputer outwits humans in poker rematch.” Online under http://triblive.com/local/allegheny/11865933-74/rematch-aaron-aupperlee (2017)

Blitzer, John, Mark Dredze, and Fernando Pereira. “Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification.” ACL. Vol. 7. 2007.

Brachman, Ronald J. “AI more than the sum of its parts.” AI Magazine 27.4 (2006): 19.

Brooks, R., et al. “Challenge problems for artificial intelligence.” Thirteenth National Conference on Artificial Intelligence-AAAI. 1996.

Brown, Noam, and Tuomas Sandholm. “Safe and Nested Endgame Solving for Imperfect-Information Games.” Online under http://www.cs.cmu.edu/~noamb/papers/17-AAAI-Refinement.pdf (2017)

Cadena, Cesar, et al. “Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age.” IEEE Transactions on Robotics 32.6 (2016): 1309–1332.

Chang, Michael B., et al. “A compositional object-based approach to learning physical dynamics.” arXiv preprint arXiv:1612.00341 (2016).

Chen, Yutian, et al. “Learning to Learn for Global Optimization of Black Box Functions.” arXiv preprint arXiv:1611.03824 (2016).

Choi, Myung Jin, Antonio Torralba, and Alan S. Willsky. “Context models and out-of-context objects.” Pattern Recognition Letters 33.7 (2012): 853–862.

Ciocarlie, Matei. “Versatility in Robotic Manipulation: the Long Road to Everywhere.” Online under https://www.youtube.com/watch?v=wiTQ6qOR8o4 (2015)

Commonsense Reasoning. “Commonsense reasoning problem page.” Online under http://commonsensereasoning.org/problem_page.html (2015)

Commonsense Reasoning. “Commonsense reasoning Winograd schema challenge.” Online under http://commonsensereasoning.org/winograd.html (2016a)

Commonsense Reasoning. “Commonsense reasoning pronoun disambiguation problems.” Online under http://commonsensereasoning.org/disambiguation.html (2016b)

Davies, Mark. The corpus of contemporary American English. BYU, Brigham Young University, 2008.

Davis, Ernest. “Collection of Winograd schemas.” Online under http://www.cs.nyu.edu/faculty/davise/papers/WinogradSchemas/WSCollection.html (2011)

de Freitas, Nando. “Learning to Learn and Compositionality with Deep Recurrent Neural Networks: Learning to Learn and Compositionality.” Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2016.

Degrave, Jonas, Michiel Hermans, and Joni Dambre. “A Differentiable Physics Engine for Deep Learning in Robotics.” arXiv preprint arXiv:1611.01652 (2016).

Deng, Jia, et al. “Imagenet: A large-scale hierarchical image database.” Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on. IEEE, 2009.

Denil, Misha, et al. “Learning to Perform Physics Experiments via Deep Reinforcement Learning.” arXiv preprint arXiv:1611.01843 (2016).

Duan, Yan, et al. “RL²: Fast Reinforcement Learning via Slow Reinforcement Learning.” arXiv preprint arXiv:1611.02779 (2016).

Ess, Andreas, et al. “Object detection and tracking for autonomous navigation in dynamic environments.” The International Journal of Robotics Research 29.14 (2010): 1707–1725.

Finn, Chelsea, and Sergey Levine. “Deep Visual Foresight for Planning Robot Motion.” arXiv preprint arXiv:1610.00696 (2016).

Fouhey, David F., and C. Lawrence Zitnick. “Predicting object dynamics in scenes.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014.

Fragkiadaki, Katerina, et al. “Learning visual predictive models of physics for playing billiards.” arXiv preprint arXiv:1511.07404 (2015).

Garofolo, John, et al. “TIMIT Acoustic-Phonetic Continuous Speech Corpus LDC93S1.” Web Download. Philadelphia: Linguistic Data Consortium, 1993.

Gaunt, Alexander L., et al. “Terpret: A probabilistic programming language for program induction.” arXiv preprint arXiv:1608.04428 (2016).

Genesereth, Michael, Nathaniel Love, and Barney Pell. “General game playing: Overview of the AAAI competition.” AI magazine 26.2 (2005): 62.

Goertzel, Ben. “What counts as a conscious thinking machine?” Online under https://www.newscientist.com/article/mg21528813.600-what-counts-as-a-conscious-thinking-machine (2012)

GoodAI. “General AI Challenge.” Online under https://www.general-ai-challenge.org/ (2017)

Graves, Alex, et al. “Hybrid computing using a neural network with dynamic external memory.” Nature 538.7626 (2016): 471–476.

Hamrick, Jessica B., et al. “Imagination-Based Decision Making with Physical Models in Deep Neural Networks.” Online under http://phys.csail.mit.edu/papers/5.pdf (2016)

Han, Dongyoon, Jiwhan Kim, and Junmo Kim. “Deep Pyramidal Residual Networks.” arXiv preprint arXiv:1610.02915 (2016).

Harlow, Harry F. “The formation of learning sets.” Psychological review 56.1 (1949): 51.

Held, Richard, and Alan Hein. “Movement-produced stimulation in the development of visually guided behavior.” Journal of comparative and physiological psychology 56.5 (1963): 872.

Hernández-Orallo, José. “Evaluation in artificial intelligence: from task-oriented to ability-oriented measurement.” Artificial Intelligence Review (2016a): 1–51.

Hernández-Orallo, José, et al. “Computer models solving intelligence test problems: progress and implications.” Artificial Intelligence 230 (2016b): 74–107.

Hernández-Orallo, José. “The measure of all minds.” Cambridge University Press, 2017.

IOCCC. “The International Obfuscated C Code Contest.” Online under http://www.ioccc.org (2016)

Kadlec, Rudolf, et al. “Finding a jack-of-all-trades: an examination of semi-supervised learning in reading comprehension.” Under review at ICLR 2017, online under https://openreview.net/pdf?id=rJM69B5xx

Kamper, Herman, Aren Jansen, and Sharon Goldwater. “Unsupervised word segmentation and lexicon discovery using acoustic word embeddings.” IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP) 24.4 (2016): 669–679.

Kamprani, Katerina. “The uncomfortable.” Online under http://www.kkstudio.gr/#the-uncomfortable (2017)

Karpathy, Andrej. “What I learned from competing against a ConvNet on ImageNet.” Online under http://karpathy.github.io/2014/09/02/what-i-learned-from-competing-against-a-convnet-on-imagenet (2014)

Kirkpatrick, James, et al. “Overcoming catastrophic forgetting in neural networks.” arXiv preprint arXiv:1612.00796 (2016).

Kondo, H. M., et al. “Auditory and visual scene analysis: an overview.” Philosophical transactions of the Royal Society of London. Series B, Biological sciences 372.1714 (2017).

Lahat, Dana, Tülay Adali, and Christian Jutten. “Multimodal data fusion: an overview of methods, challenges, and prospects.” Proceedings of the IEEE 103.9 (2015): 1449–1477.

Lake, Brenden M., Ruslan Salakhutdinov, and Joshua B. Tenenbaum. “Human-level concept learning through probabilistic program induction.” Science 350.6266 (2015): 1332–1338.

Lake, Brenden M., et al. “Building machines that learn and think like people.” arXiv preprint arXiv:1604.00289 (2016).

LeCun, Yann. “The Next Frontier in AI: Unsupervised Learning.” Online under http://www.ri.cmu.edu/event_detail.html?event_id=1211&&menu_id=242&event_type=seminars (2016)

Lewicki, Michael S., et al. “Scene analysis in the natural environment.” Frontiers in psychology 5 (2014): 199.

Li, Wenbin, Aleš Leonardis, and Mario Fritz. “Visual stability prediction and its application to manipulation.” arXiv preprint arXiv:1609.04861 (2016).

Li, Zhizhong, and Derek Hoiem. “Learning without forgetting.” European Conference on Computer Vision. Springer International Publishing, 2016.

Loebner, Hugh. “Home page of the Loebner prize-the first Turing test.” Online under http://www.loebner.net/Prizef/loebner-prize.html (2016).

Lovett, Andrew, and Kenneth Forbus. “Modeling visual problem solving as analogical reasoning.” Psychological Review 124.1 (2017): 60.

Malik, Jitendra. “The Hilbert Problems of Computer Vision.” Online under https://www.youtube.com/watch?v=QaF2kkez5XU (2015)

McCarthy, John. “An example for natural language understanding and the AI Problems it raises.” Online under http://www-formal.stanford.edu/jmc/mrhug/mrhug.html (1976)

McLeod, John. “Card game rules — card games and tile games from around the world.” Online under https://www.pagat.com (2017)

Miller, George A. “WordNet: a lexical database for English.” Communications of the ACM 38.11 (1995): 39–41.

Mottaghi, Roozbeh, et al. ““What happens if…” Learning to Predict the Effect of Forces in Images.” European Conference on Computer Vision. Springer International Publishing, 2016.

Morency, Louis-Philippe. “Multimodal Machine Learning.” Online under https://www.youtube.com/watch?v=pMb_CIK14lU (2015)

Nair, Ashvin, et al. “Combining Self-Supervised Learning and Imitation for Vision-Based Rope Manipulation.” Online under http://phys.csail.mit.edu/papers/15.pdf (2016)

Nguyen, Anh, Jason Yosinski, and Jeff Clune. “Deep neural networks are easily fooled: High confidence predictions for unrecognizable images.” 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2015.

Nilsson, Nils J. “Human-level artificial intelligence? Be serious!.” AI magazine 26.4 (2005): 68.

OpenAI. “Requests for research.” Online under https://openai.com/requests-for-research (2016)

Pan, Sinno Jialin, and Qiang Yang. “A survey on transfer learning.” IEEE Transactions on knowledge and data engineering 22.10 (2010): 1345–1359.

Pang, Bo, and Lillian Lee. “Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales.” Proceedings of the 43rd annual meeting on association for computational linguistics. Association for Computational Linguistics, 2005.

Parisotto, Emilio, Jimmy Lei Ba, and Ruslan Salakhutdinov. “Actor-mimic: Deep multitask and transfer reinforcement learning.” arXiv preprint arXiv:1511.06342 (2015).

Park, Alex S., and James R. Glass. “Unsupervised pattern discovery in speech.” IEEE Transactions on Audio, Speech, and Language Processing 16.1 (2008): 186–197.

Rajaram, Rakesh Nattoji, Eshed Ohn-Bar, and Mohan M. Trivedi. “An exploration of why and when pedestrian detection fails.” 2015 IEEE 18th International Conference on Intelligent Transportation Systems. IEEE, 2015.

Riccardi, Giuseppe, and Dilek Z. Hakkani-Tür. “Active and unsupervised learning for automatic speech recognition.” Interspeech. 2003.

Robo chat challenge. “Robo chat challenge 2014.” Online under http://www.robochatchallenge.com (2014)

Rosa, Marek, Jan Feyereisl, and The GoodAI Collective. “A Framework for Searching for General Artificial Intelligence.” arXiv preprint arXiv:1611.00685 (2016).

Schmidhuber, Jurgen. “Evolutionary principles in self-referential learning. (On learning how to learn: The meta-meta-… hook.)” Diploma thesis, Institut f. Informatik, Tech. Univ. Munich (1987).

Silver, David, et al. “Mastering the game of Go with deep neural networks and tree search.” Nature 529.7587 (2016): 484–489.

Smith, Linda, and Michael Gasser. “The development of embodied cognition: Six lessons from babies.” Artificial life 11.1–2 (2005): 13–29.

Song, Shuran, Linguang Zhang, and Jianxiong Xiao. “Robot in a room: Toward perfect object recognition in closed environments.” CoRR (2015).

Stewart, Russell, and Stefano Ermon. “Label-free supervision of neural networks with physics and domain knowledge.” arXiv preprint arXiv:1609.05566 (2016).

Tarlow, Daniel. “In Search of Strong Generalization.” Online under https://uclmr.github.io/nampi/talk_slides/tarlow-nampi.pdf (2016)

Thrun, Sebastian, and Tom M. Mitchell. “Lifelong robot learning.” Robotics and autonomous systems 15.1–2 (1995): 25–46.

Thrun, Sebastian, and Lorien Pratt. “Learning to learn: Introduction and overview.” Learning to learn. Springer US, 1998. 3–17.

TMOU. “Archiv TMOU.” Online under http://www.tmou.cz/archiv/index (2016)

Verschae, Rodrigo, and Javier Ruiz-del-Solar. “Object detection: current and future directions.” Frontiers in Robotics and AI 2 (2015): 29.

Wang, Jane X., et al. “Learning to reinforcement learn.” arXiv preprint arXiv:1611.05763 (2016).

Yao, Bangpeng, Jiayuan Ma, and Li Fei-Fei. “Discovering object functionality.” Proceedings of the IEEE International Conference on Computer Vision. 2013.

Yelp. “The Yelp Dataset Challenge.” Online under https://www.yelp.com/dataset_challenge (2017)


Unsolved Problems in AI was originally published in AI Roadmap Institute Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Roadmap Comparison at GoodAI

Guest post by Martin Stránský, Research Scientist @GoodAI

Figure 1. GoodAI architecture development roadmap comparison

Recent progress in artificial intelligence, especially in the area of deep learning, has been breath-taking. This is very encouraging for anyone interested in the field, yet the true progress towards human-level artificial intelligence is much harder to evaluate.

The evaluation of artificial intelligence is a very difficult problem for a number of reasons. For example, the lack of consensus on the basic desiderata necessary for intelligent machines is one of the primary barriers to the development of unified approaches towards comparing different agents. Despite a number of researchers specifically focusing on this topic (e.g. José Hernández-Orallo or Kristinn R. Thórisson to name a few), the area would benefit from more attention from the AI community.

Methods for evaluating AI are important tools that help to assess the progress of already built agents. The comparison and evaluation of roadmaps and approaches towards building such agents is, however, less explored. Such comparison is potentially even harder, due to the vagueness and limited formal definitions within such forward-looking plans.

Nevertheless, we believe that in order to steer towards promising areas of research and to identify potential dead-ends, we need to be able to meaningfully compare existing roadmaps. Such comparison requires the creation of a framework that defines processes on how to acquire important and comparable information from existing documents outlining their respective roadmaps. Without such a unified framework, roadmaps might differ not only in their targets (e.g. general AI, human-level AI, conversational AI, etc.) but also in their approaches to achieving those targets, which can make them impossible to compare and contrast.

This post offers a glimpse of how we, at GoodAI, are starting to look at this problem internally (comparing the progress of our three architecture teams), and how this might scale to comparisons across the wider community. This is still very much a work-in-progress, but we believe it might be beneficial to share these initial thoughts with the community, to start a discussion about what we believe is an important topic.

Overview

In the first part of this article, a comparison of three GoodAI architecture development roadmaps is presented and a technique for comparing them is discussed. The main purpose is to estimate the potential and completeness of plans for every architecture to be able to direct our effort to the most promising one.

To manage adding roadmaps from other teams we have developed a general plan of human-level AI development called a meta-roadmap. This meta-roadmap consists of 10 steps which must be passed in order to reach an ‘ultimate’ target. We hope that most of the potentially disparate plans solve one or more problems identified in the meta-roadmap.

Next, we tried to compare our approaches with that of Mikolov et al. by assigning the current documents and open tasks to problems in the meta-roadmap. We found that useful, as it showed us what is comparable and that different techniques of comparison are needed for every problem.

Architecture development plans comparison

Three teams from GoodAI have been working on their architectures for a few months. Now we need a method to measure the potential of the architectures to be able to, for example, direct our effort more efficiently by allocating more resources to the team with the highest potential. We know that determining which way is the most promising based on the current state is still not possible, so we asked the teams working on unfinished architectures to create plans for future development, i.e. to create their roadmaps.

Based on the provided responses, we have iteratively unified requirements for those plans. After numerous discussions, we came up with the following structure:

  • A unit of a plan is called a milestone and describes some piece of work on a part of the architecture (e.g. a new module, a different structure, an improvement of a module by adding functionality, tuning parameters etc.)
  • Each milestone contains a time estimate (the expected time spent on the milestone, assuming the current team size), a characteristic of the work or new features, and a test of the new features.
  • A plan can be interrupted by checkpoints which serve as common tests for two or more architectures.

Now we have a set of basic tools to monitor progress:

  • We can see whether a particular team passes its self-designed tests and thereby fulfills its original expectations on schedule.
  • Thanks to checkpoints, it is possible to compare architectures in the middle of development.
  • We can see how far a team sees. Ideally after finishing the last milestone, the architecture should be prepared to pass through a curriculum (which will be developed in the meantime) and a final test afterwards.
  • We can compare total time estimates as well.
  • We are still working on a unified set (among GoodAI architectures) of features which we will require from an architecture (desiderata for an architecture).
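The plan structure and monitoring tools above can be sketched as a small data model. This is only an illustration under stated assumptions: the class names, field names, the use of months as the time unit, and all example plans are hypothetical and do not come from the actual GoodAI roadmaps.

```python
from dataclasses import dataclass, field


@dataclass
class Milestone:
    """One unit of a plan. Field names are illustrative assumptions."""
    name: str
    time_estimate_months: float  # expected time at the current team size (unit assumed)
    work: str                    # characteristic of the work or new features
    test: str                    # test of the new features


@dataclass
class Roadmap:
    architecture: str
    milestones: list = field(default_factory=list)
    checkpoints: set = field(default_factory=set)  # names of common tests

    def total_time(self) -> float:
        """Sum of the milestone time estimates."""
        return sum(m.time_estimate_months for m in self.milestones)


def shared_checkpoints(a: Roadmap, b: Roadmap) -> set:
    """Checkpoints at which two architectures can be compared mid-development."""
    return a.checkpoints & b.checkpoints


# Hypothetical plans: every name and number below is made up for illustration.
plan_a = Roadmap(
    "architecture A",
    [Milestone("M1", 3.0, "new module", "module-level test"),
     Milestone("M2", 2.0, "parameter tuning", "regression test")],
    {"checkpoint 1"},
)
plan_b = Roadmap(
    "architecture B",
    [Milestone("M1", 4.0, "different structure", "structure test")],
    {"checkpoint 1", "checkpoint 2"},
)
plan_c = Roadmap(
    "architecture C",
    [Milestone("M1", 6.0, "symbolic core", "symbol manipulation test")],
)

for plan in (plan_a, plan_b, plan_c):
    print(plan.architecture, plan.total_time())
print(shared_checkpoints(plan_a, plan_b))  # one common test
print(shared_checkpoints(plan_a, plan_c))  # empty: no common test defined
```

An empty intersection, as for the hypothetical architecture C, corresponds to the situation where two plans have no common test and therefore cannot be compared until the end of development.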

The particular plans were placed side by side (cf. Figure 1) and a few checkpoints were (currently vaguely) defined. As we can see, teams have rough plans of their work for more than one year ahead, yet the plans are not complete in the sense that the architectures will not be ready for any curriculum. Two architectures use a connectionist approach and they are easy to compare. The third, OMANN, manipulates symbols, thus from the beginning it can perform tasks which are hard for the other two architectures and vice versa. This means that no checkpoints for OMANN have been defined yet. We see a lack of common tests as a serious issue with the plan and are looking for changes to make the architecture more comparable with the others, although it may cause some delays with the development.

There was an effort to include another architecture in the comparison, but we have not been able to find a document describing future work in such detail, with the exception of the paper by Weston et al. After further analysis, we determined that the paper was focused on a slightly different problem than the development of an architecture. We will address this later in the post.

Assumptions for a common approach

We would like to take a look at the problem from the perspective of the unavoidable steps required to develop an intelligent agent. First we must make a few assumptions about the whole process. We realize that these are somewhat vague — we want to make them acceptable to other AI researchers.

  1. The target is to produce software (referred to as an architecture), which can be a part of some agent in some world.
  2. In the world there will be tasks that the agent should solve, or a reward based on world states that the agent should seek.
  3. An intelligent agent can adapt to an unknown/changing environment and solve previously unseen tasks.
  4. To check whether the ultimate goal was reached (no matter how defined), every approach needs some well defined final test, which shows how intelligent the agent is (preferably compared to humans).

Before the agent is able to pass its final test, there must be a learning phase in order to teach the agent all necessary skills or abilities. If there is a possibility that the agent can pass the final test without learning anything, the final test is insufficient with respect to point 3. The description of the learning phase (which can also include a world description) is called a curriculum.

Meta-roadmap

Using the above assumptions (and a few more obvious ones which we won’t enumerate here) we derive Figure 2 describing the list of necessary steps and their order. We call this diagram a meta-roadmap.

Figure 2. Overview of a meta-roadmap

The most important and imminent tasks in the diagram are

  • The definition of an ultimate target,
  • A final test specification,
  • The proposed design of a curriculum, and
  • A roadmap for the development of an architecture.

We think that the majority of current approaches solve one or more of these open problems, each from a different point of view depending on the ultimate target and the beliefs of the authors. To make the comparison clearer, we will divide approaches described in published papers into groups according to the problem that they solve and compare them within those groups. Of course, approaches are hard to compare across groups (yet it is not impossible; for example, a final test can be comparable to a curriculum under specific circumstances). Even within one group, comparison can be very hard when requirements (which are the first thing that should be defined according to our diagram) differ significantly.

Also an analysis of complexity and completeness of an approach can be made within this framework. For example, if a team omits one or more of the open problems, it indicates that the team may not have considered that particular issue and are proceeding without a complete notion of the ‘big picture’.
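The completeness check described here can be sketched as a simple set comparison. This is a minimal sketch that assumes only the four tasks listed above stand in for the open problems (the full meta-roadmap has 10 steps); the function name and the example approach are hypothetical.

```python
# The four most important tasks named in the text. Treating them as the full
# set of open problems is a simplifying assumption of this sketch.
META_ROADMAP_PROBLEMS = (
    "ultimate target",
    "final test",
    "curriculum",
    "architecture roadmap",
)


def missing_problems(addressed):
    """Return the open problems an approach does not address, in roadmap order."""
    addressed = set(addressed)
    unknown = addressed - set(META_ROADMAP_PROBLEMS)
    if unknown:
        raise ValueError(f"not a meta-roadmap problem: {sorted(unknown)}")
    return [p for p in META_ROADMAP_PROBLEMS if p not in addressed]


# Hypothetical approach that defines a target and a curriculum only.
example_approach = {"ultimate target", "curriculum"}
print(missing_problems(example_approach))
# ['final test', 'architecture roadmap']
```

A non-empty result does not prove that a team ignored a problem; it only flags where the published documents say nothing, which is the signal the paragraph above describes.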

Problem assignment

We would like to show an attempt to assign approaches to problems and compare them. First, we have analyzed GoodAI’s and Mikolov/Weston’s approaches, as the latter is well described. You can see the result in Figure 3 below.

Figure 3. Meta-roadmap with incorporated desiderata for different roadmaps

As the diagram suggests, we work on a few common problems. We will not provide the full analysis here, but will make several observations to demonstrate the meaningfulness of the meta-roadmap. In the desiderata, according to Mikolov’s “A Roadmap towards Machine Intelligence”, the target is an agent which can understand human language. In contrast with the GoodAI approach, modalities other than text are not considered as important. In the curriculum, GoodAI wants to teach an agent in a more anthropocentric way (visual input first, language later), while the entirety of Weston’s curriculum consists of language-oriented tasks.

Mikolov et al. do not provide a development plan for their architecture, so we can compare their curriculum roadmap to ours, but it is not possible to include their desiderata in the diagram in Figure 1.

Conclusion

We have presented our meta-roadmap and a comparison of three GoodAI development roadmaps. We hope that this post will offer a glimpse into how we started this process at GoodAI and will invigorate a discussion on how this could be improved and scaled beyond internal comparisons. We will be glad to receive any feedback — the generality of our meta-roadmap should be discussed further, as well as our methods for estimating roadmap completeness and their potential to achieve human-level AI.


Roadmap Comparison at GoodAI was originally published in AI Roadmap Institute Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Why businesses must get ready for the era of robotic things


We’ve entered a period of epic technological transformation that is impacting society in ways that are leaving even veteran tech observers speechless. In some ways, it might seem like 1998 all over again. The Internet was then in its infancy and cyberspace was uncharted territory for much of the population. The dot-com boom and eventual bust were inevitable: markets expanded and contracted with the newfound surge of interest in, and obviously overinflated speculation about, the potential of the Internet to transform society. Following a tremendous growth spurt in the first 5-6 years, the World Wide Web ended its first decade by learning to become sociable.

Indeed, the rise of the digital social network through channels like Facebook, Twitter, Instagram, and Snapchat has transformed how businesses and individuals work, think, and communicate. And now that the Internet has crossed the threshold into young adulthood, it’s continuing to grow and drive massive transformations throughout all parts of business and society. Soon-to-be-released autonomous cars, wearable tech, drones, 3-D printing, smart machines, home automation, virtual assistants like Siri, you name it . . . the pace of change is staggering.

Graphic by Predikto on company blog.

Many of the breakthroughs and innovations we’re seeing right now are the result of a confluence of three major technologies over the past 8 years: mobile, cloud, and Big Data. These technologies have collectively resulted in quicker and more efficient means of collaboration, development, and production, which in turn have allowed businesses to achieve unprecedented levels of growth and expansion. New ways of working, often remotely, mean that more people can do their jobs outside of traditional corporate structures. We’ve entered not only the freelancer economy, but the startup one as well. Now everyone is an entrepreneur, and innovation and new business growth are through the roof. What’s more, technology processes and production cycles have become commoditized. This means that anyone with the right idea, system, skills, and network in place today can effectively build a billion-dollar business with very low overhead costs.

This crazy rate of change is certainly great from the consumer standpoint, but also a bit unnerving for businesses simply trying to keep their heads above water. How are companies today to keep up with the market, their competitors, and consumer expectations? What does all this change mean for startups struggling to gain traction, for more traditional brick-and-mortar businesses, and even for established enterprises that might be too big to pivot quickly?

These are more comprehensive questions that we’ll try to answer in another blog post. But for now, here is what we do know about the massive impacts over the next few years. The first thing is that the Internet of Things is taking the world by storm, with projections that 21 billion objects will be connected by the year 2020. That’s just about 3 for every man, woman, and child on the planet! A few years ago Cisco estimated that the IoT market would create $19 trillion of economic value in the next decade.

What’s more, the global robotics industry is also undergoing a major transformation. Market intelligence firm Tractica released a report in November 2015 forecasting that the global robotics market will grow from $28.3 billion worldwide in 2015 to $151.7 billion by 2020. What’s especially significant is that this market will encompass mostly non-industrial robots, including segments like consumer, enterprise, medical, military, UAVs, and autonomous vehicles. Tractica anticipates an increase in annual robot shipments from 8.8 million in 2015 to 61.4 million by 2020; in fact, by 2020 over half of this volume will come from consumer robots.
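To put the Tractica figures in perspective, the implied compound annual growth rate can be worked out from the quoted start and end values. This is a back-of-the-envelope sketch that assumes smooth compounding over the five years from 2015 to 2020; that is an assumption of the example, not a figure taken from the report.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    if start <= 0 or years <= 0:
        raise ValueError("start and years must be positive")
    return (end / start) ** (1 / years) - 1


# Values quoted in the text; the 5-year span (2015 to 2020) is assumed.
market_growth = cagr(28.3, 151.7, 5)    # market size in $ billion
shipment_growth = cagr(8.8, 61.4, 5)    # annual shipments in millions of units

print(f"Market size: {market_growth:.1%} per year")
print(f"Shipments:   {shipment_growth:.1%} per year")
```

Under this assumption the market would grow by roughly 40% per year and shipments by a little under 50% per year.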


Putting together the two major industry trends, it doesn’t take rocket science to figure out that the two industries – Internet of Things and Robotics – will together lead to a “perfect storm” of global market disruption, opportunities, and growth in the next 4 years and beyond. This confluence is part of a larger epic transformation, which has appropriately been called the Second Machine Age. Listen to how this FastCompany article sums it up:

The fact is we’re now on the cusp of a “Second Machine Age,” one powered not by clanging factory equipment but by automation, artificial intelligence, and robotics. Self-driving cars are expected to be widespread in the coming decade. Already, automated checkout technology has replaced cashiers, and computerized check-in is the norm at airports. Just like the Industrial Revolution more than 200 years ago, the AI and robotics revolution is poised to touch virtually every aspect of our lives—from health and personal relations to government and, of course, the workplace.

This is a mouthful but in case it’s not clear, let me spell it out: there’s never been a better time than now to get onboard with robotics and Internet of Things!

If you’re a startup or small business owner, and especially if you’re feeling behind the technology curve, you’re certainly not alone. But instead of commiserating about all of the changes, proactively start today to ask yourself what it will take to get your organization to the next level of innovation. Set yourself up with a 6-month, 12-month, 18-month, and 2-year innovation plan which maps to a broader 2020 strategy. Time is of the essence but it’s not too late to pivot and get onboard with the robotics and IoT revolution. As the famous saying goes, “The journey of a thousand miles starts with one step.”
