Human action recognition (HAR) is an active area of scientific research. Hand gesture recognition, a subcategory of HAR, plays an important role in communicating with deaf people. Convolutional neural network (CNN) structures are frequently used to recognize human actions. In this study, the hyperparameters of CNN structures based on the AlexNet model are optimized with heuristic optimization algorithms. The proposed method is tested on the sign language digits dataset and Thomas Moeslund's gesture recognition dataset. Because heuristic algorithms are stochastic, the training procedure is repeated 30 times for each dataset. On the sign language digits dataset, the proposed artificial bee colony-based method achieves an average classification accuracy of 98.40%, compared with 94.2% for the existing work. On Thomas Moeslund's gesture recognition dataset, the proposed approach achieves an average accuracy of 98.09%, outperforming the best existing work, which reported 94.33%.
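To make the idea of heuristic hyperparameter search concrete, the following is a minimal sketch of a generic artificial bee colony loop, averaged over 30 seeded runs. It is an illustration under stated assumptions, not the authors' implementation: the search space `BOUNDS`, the function names, and the colony settings are all hypothetical, and `toy_score` is a stand-in for training an AlexNet-style CNN and returning its validation accuracy.

```python
import random

# Hypothetical search space (assumption): (low, high) bounds per hyperparameter.
BOUNDS = {"log10_lr": (-5.0, -1.0), "dropout": (0.0, 0.8)}


def toy_score(params):
    """Stand-in for validation accuracy (assumption); always positive on BOUNDS."""
    return (1.0
            - 0.05 * (params["log10_lr"] + 3.0) ** 2
            - 0.5 * (params["dropout"] - 0.5) ** 2)


def random_source(rng):
    """Sample one candidate hyperparameter set uniformly within BOUNDS."""
    return {k: rng.uniform(lo, hi) for k, (lo, hi) in BOUNDS.items()}


def neighbor(source, partner, rng):
    """Perturb one randomly chosen dimension relative to a partner source."""
    key = rng.choice(list(BOUNDS))
    lo, hi = BOUNDS[key]
    step = rng.uniform(-1.0, 1.0) * (source[key] - partner[key])
    candidate = dict(source)
    candidate[key] = min(hi, max(lo, source[key] + step))
    return candidate


def abc_search(score, n_sources=10, n_cycles=50, limit=10, seed=0):
    """Maximize `score` with a basic artificial bee colony loop."""
    rng = random.Random(seed)
    sources = [random_source(rng) for _ in range(n_sources)]
    fitness = [score(s) for s in sources]
    trials = [0] * n_sources  # consecutive failed improvements per source
    best_i = max(range(n_sources), key=fitness.__getitem__)
    best, best_fit = dict(sources[best_i]), fitness[best_i]

    def try_improve(i):
        nonlocal best, best_fit
        j = rng.choice([k for k in range(n_sources) if k != i])
        cand = neighbor(sources[i], sources[j], rng)
        f = score(cand)
        if f > fitness[i]:  # greedy selection: keep the better source
            sources[i], fitness[i], trials[i] = cand, f, 0
            if f > best_fit:
                best, best_fit = dict(cand), f
        else:
            trials[i] += 1

    for _ in range(n_cycles):
        # Employed bee phase: each source is explored once.
        for i in range(n_sources):
            try_improve(i)
        # Onlooker phase: sources are picked in proportion to fitness
        # (this assumes all scores are positive).
        for i in rng.choices(range(n_sources), weights=fitness, k=n_sources):
            try_improve(i)
        # Scout phase: abandon sources that stopped improving.
        for i in range(n_sources):
            if trials[i] > limit:
                sources[i] = random_source(rng)
                fitness[i] = score(sources[i])
                trials[i] = 0
                if fitness[i] > best_fit:
                    best, best_fit = dict(sources[i]), fitness[i]
    return best, best_fit


def mean_best_score(n_runs=30):
    """Average the best score over independent seeded runs."""
    results = [abc_search(toy_score, seed=s)[1] for s in range(n_runs)]
    return sum(results) / n_runs


if __name__ == "__main__":
    params, fit = abc_search(toy_score, seed=0)
    print("best params:", params)
    print("best score:", fit)
    print("mean over 30 runs:", mean_best_score())
```

In a real experiment, `toy_score` would be replaced by a function that builds and trains the network with the given hyperparameters, which makes each evaluation expensive. Averaging over repeated seeded runs reflects the abstract's point that results from a stochastic optimizer should be reported as a mean rather than from a single run.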